The Giving (Search) Tree
==================
Better Call (Monte Carlo Tree) Saul
Another ‘whatcha been hacking on’ post
Inspired by the excellent work in Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning, my research direction was to take the same approach but turn it into a simple game of argumentation: can we get a model to debate/argue like a lawyer? Win arguments? Defeat an opponent?
Law, unlike verifiable games like chess, is a game of persuasion. It has no verifiable outcome the way mathematics does. I can argue a correct thing - morally, ethically, whatever - and be ruled incorrect by a single judge or jury. Obviously, when the facts are plainly on your side you’ll win more than you lose, but there’s a gamesmanship to the game. The goal is to convince the judge that your argument is correct (even if it’s not) - sort of like Noam Brown’s Cicero. It gets even more complicated when you argue to a jury, where the judge is not the only one deciding/directing the outcome, but in this over-simplified structure, we’ll just have a judge.
The first step is to use an established framework to build a game of persuasion. Here, we’ll use Debatrix: LLM-based Multi-dimensional Debate Judge with Iterative Chronological Analysis; then we’ll use the Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning framework to create prompts in order to RL-train a basic-b Mistral-7B to be a better debater/lawyer.
In the vein of Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking, Chain of Logic: Rule-Based Reasoning with Large Language Models, and other general LLM reasoning work (Entropix, etc.), we eventually want to move toward ‘think before you speak’ / more test-time compute (e.g., Quiet-STaR/o1/etc.), but in the meantime, let’s see if we can build a dataset that bakes some next-token-prediction better-call-saul fundamentals into this model so it can argue like a lawyer.
I won’t/can’t include all the code (the changes to DPO-MCTS or Debatrix, the trained model, the pipeline, or really anything super useful tbh - secret-sauce-nda-etc-etc) - but I do want to share the thought process of how I got there; maybe it will be useful to someone else.
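The rough shape of the loop is easy enough to sketch, though (pure pseudo-code with placeholder names - extract_preference_pairs, dpo_loss, etc. are stand-ins, not real functions from either repo):
# one training iteration, roughly (placeholder names)
for case_prompt in legal_cases:
    # 1. self-play a debate tree: the policy argues one side, a frozen opponent the other,
    #    and Debatrix's verdict is the reward that backs up through the tree
    tree = mcts_searcher(case_prompt)
    # 2. pull preference pairs out of the tree: at each step the highest-value child
    #    is the 'better' continuation, a low-value sibling is the 'worse' one
    pairs = extract_preference_pairs(tree)
    # 3. DPO-style update of the policy on those pairs, anchored to the reference model
    for better, worse in pairs:
        loss = dpo_loss(policy, reference_policy, better, worse)
        loss.backward()
        optimizer.step()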
Pseudo-code
here’s the pseudo/real-code we’ll use:
from transformers import (
AutoModelForCausalLM,
GenerationConfig,
PreTrainedModel,
PreTrainedTokenizerBase,
get_scheduler,
)
from mcts_rl_testing.algorithms.mcts.mcts import (
CourtRoomV2,
LawStepLMConfig,
SearchArgs,
LawMCTS,
MCTSNode,
MCTSConfig,
TreeConstructor,
)
import random
from queue import Queue

import torch
import torch.nn.functional as F
# DebateInfo, DebaterInfo, Turn, Verdict, DebateResult come from the Debatrix package
offline = False
global_step = 1
generation_config = GenerationConfig(
max_length=max_length,
max_new_tokens=max_new_tokens,
num_return_sequences=num_return_sequences,
temperature=temperature,
top_p=top_p,
repetition_penalty=repetition_penalty,
do_sample=True,
bos_token_id=tokenizer.bos_token_id,
eos_token_id=tokenizer.eos_token_id,
pad_token_id=tokenizer.pad_token_id,
)
world_model = CourtRoomV2(
max_length=generation_config.max_length,
base_tokenizer=tokenizer,
generation_config=generation_config,
opponent_model=actor_model
)
search_cfg = LawStepLMConfig(SearchArgs(
ref_policy_model=actor_reference_model,
base_tokenizer=tokenizer,
generation_config=generation_config,
n_actions=n_actions,
n_init_actions=n_init_actions,
breadth_limit=breadth_limit,
depth_limit=depth_limit,
force_terminating_on_depth_limit=force_terminating_on_depth_limit,
kl_coeff=kl_coeff,
disable_tqdm=False,
no_self_eval=no_self_eval,
reward_model=reward_model,
reward_tokenizer=reward_tokenizer,
use_code=use_code,
use_mcq=use_mcq,
eval_mode=eval_mode,
temperature=temperature,
init_temperature=init_temperature,
get_tp_zero=get_tp_zero,
model_type=model_type,
include_gt=include_gt,
))
mcts_algo = LawMCTS(MCTSConfig(
w_exp=w_exp,
depth_limit=depth_limit,
breadth_limit=breadth_limit,
n_iters=n_iters,
temperature=mcts_temperature,
temperature_decay_ratio=mcts_temperature_decay_ratio,
consider_diversity=(not no_consider_diversity),
length_penalty=mcts_length_penalty,
))
mcts_searcher = TreeConstructor(
world_model=world_model,
search_config=search_cfg,
search_algo=mcts_algo,
)
# set all clients
world_model.debatrix.session_id = "debatrix"
world_model.set_debatrix_config()
world_model.set_debatrix_server()
world_model.set_debatrix_debateinfo()
Debatrix judging
we create a CourtRoomV2 model, a pseudo-wrapper around the Debatrix framework, to judge the debate:
its main job is scoring the players’ actions relative to each other, against the conditions set by the scoring criteria (e.g., the judge’s rules)
def judge(self, message_state: List[dict]) -> bool:
debate_info = DebateInfo(
motion="motion",
pro_side=DebaterInfo(name="player1"),
con_side=DebaterInfo(name="player2"),
info_slide="info_slide",
speech_order=("player1", "player2"),
)
# reset debatrix
self.debatrix._arena.reset()
self.debatrix._panel.reset()
# create counter
self.debatrix._debater_counter = {
debater_info.name: 0 for debater_info in self.debatrix._debate_info.all_debaters_info
}
self.debatrix._main_counter = 0
# populate speech queue
queue = Queue()
for message in message_state:
self.debatrix._main_counter += 1
self.debatrix._debater_counter[message['role']] += 1
turn = Turn(
index=self.debatrix._main_counter,
index_by_debater=self.debatrix._debater_counter[message['role']],
debater_name=message['role'],
content=message['content'],
)
queue.put(turn)
self.debatrix._arena.update(turn=turn)
queue.put(None)
turns: list[Turn] = []
while True:
turn: Turn | None = queue.get()
if turn is None:
break
self.debatrix._panel.update(turn=turn)
turns.append(turn)
dimensional_verdicts = self.debatrix._panel.dimensional_judge()
    final_verdict: Verdict | None = self.debatrix._panel.summarize(verdicts=dimensional_verdicts)
judge_result = DebateResult(
        speeches=tuple(turns),
dimensional_verdicts=dimensional_verdicts,
final_verdict=final_verdict,
)
    # player1 is the model we're training; player2 is the opponent
winner = judge_result.final_verdict
return winner == 'player1' # is this ok?
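one thing this snippet doesn’t show is how that boolean feeds back into the search: in my wrapper, the world model turns the verdict into a terminal reward once a rollout reaches the end of the debate. A minimal sketch of that hook (the method name and reward values are mine - assumptions, not Debatrix or MCTS-DPO API):
def terminal_reward(self, message_state: List[dict]) -> float:
    # +1 if the Debatrix panel rules for player1 (the policy being trained), -1 otherwise;
    # this is the value that gets backed up through the tree
    return 1.0 if self.judge(message_state) else -1.0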
Tree Construction
eval_inputs = tokenizer(
prompt_only_batch['player1'],
add_special_tokens=True,
return_tensors='pt',
)
mcts_searcher.search_config.generation_config.max_length = eval_inputs['input_ids'].size(-1) + max_length
mcts_searcher.world_model.generation_config.max_length = eval_inputs['input_ids'].size(-1) + max_length
"""Rollout a batch of experiences."""
input_ids = prompt_only_batch['input_ids']
attention_mask = prompt_only_batch['attention_mask']
opponent_prompt = prompt_only_batch['player2'] # could call it whatever
assert input_ids.size(0) == 1, '''Only support one instance per device.'''
seq, attn_msk = input_ids[0], attention_mask[0]
global_step=1
mcts_searcher.search_algo.policy_model = actor_reference_model
target_probs, Q_values, r_values, base_values, visit_counts, select_indexes = [], [], [], [], [], []
cur_node = None
# initialize debatrix
world_model.debatrix.initialize(world_model.debate_info, world_model.server_info)
# Initialize counters
world_model.debatrix._debater_counter = {debater_info.name: 0 for debater_info in world_model.debate_info.all_debaters_info}
world_model.debatrix._main_counter = 0
world_model.device = torch.device('cuda')
# start one iter for sampling
mcts_rst = mcts_searcher(
{
'input_ids': seq, 'attention_mask': attn_msk,
"opponent_prompt": opponent_prompt,
"base_prompt": prompt_only_batch["player1"],
"prompt": prompt_only_batch["player1"][0]
},
node=cur_node
)
results=[]
# explore the post-tree constructor
pi, cur_node = mcts_rst.next_action_pi, mcts_rst.tree_state
target_probs.append(pi)
Q_values.append([child.Q for child in cur_node.children])
r_values.append([child.r for child in cur_node.children])
base_values.append([child.value for child in cur_node.children])
visit_counts.append([child.N for child in cur_node.children])
# results.append([child.result for child in cur_node.children])
cur_node = cur_node.children[mcts_rst.next_action_idx]
select_indexes.append(mcts_rst.next_action_idx)
prompt = eval_inputs['input_ids'][0]
Post-Tree Construction
from collections import deque
message_histories, candidates, init_value_list, step_id = [], [], [], 0
while cur_node.children:
depth = 0
next_message_histories = []
for child in cur_node.children:
queue = deque([(child, depth)])
while queue:
node, depth = queue.popleft() # Get the front item
if node.children:
for grandchild in node.children:
queue.append((grandchild, depth + 1)) # Add children to the queue with incremented depth
next_message_histories.append((node, depth))
# filter next message histories to get the max depth
max_depth = max(next_message_histories, key=lambda x: x[1])[1]
next_message_histories = [x[0].message_history for x in next_message_histories if x[1] == max_depth]
scores = [
(q, s, r, bv, vc)
for s, q, r, bv, vc in zip(
target_probs[step_id],
Q_values[step_id],
r_values[step_id],
base_values[step_id],
visit_counts[step_id],
)
]
# Sort candidates based on scores
_candidates = [
[next_message_histories[idx], scores[idx]]
for idx in range(len(next_message_histories))
]
_candidates.sort(key=lambda x: x[1])
init_values = [score[-1] for _, score in _candidates]
candidate_histories = [messages for messages, _ in _candidates]
    # Store the data (keep the base history too, so the zip below lines up)
    message_histories.append(cur_node.message_history)
    candidates.append(candidate_histories)
    init_value_list.append(init_values)
# Move to the selected child
cur_node = cur_node.children[select_indexes[step_id]]
step_id += 1
mini_batches = {
'prompts_list': [],
'input_ids_list': [],
'attention_mask_list': [],
'init_value_list': [],
}
for base_history, candidate_histories, init_values in zip(
message_histories, candidates, init_value_list
):
input_texts = []
for candidate_history in candidate_histories:
# Serialize the full message history
full_message_history = candidate_history
input_text = serialize_message_history(full_message_history)
input_texts.append(input_text)
# Tokenize the input texts
encodings = tokenizer(
input_texts,
padding=True,
truncation=True,
return_tensors='pt',
max_length=generation_config.max_length,
)
input_ids = encodings['input_ids']
attention_mask = encodings['attention_mask']
mini_batches['prompts_list'].append(None) # Optional: store base histories if needed
mini_batches['input_ids_list'].append(input_ids)
mini_batches['attention_mask_list'].append(attention_mask)
mini_batches['init_value_list'].append(init_values)
# Compute final prediction and correctness if applicable
# For now, we can set the prediction to None or compute based on your game logic
mini_batches['prediction'] = None
mini_batches['cur_max_new_tokens'] = cur_max_new_tokens
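serialize_message_history isn’t shown above; all it has to do is flatten a message history into a single training string. A minimal sketch, assuming the roles have already been mapped to standard chat roles and the tokenizer has a chat template (with a plain role-tagged fallback):
def serialize_message_history(messages: list[dict]) -> str:
    # assumes roles are already 'system' / 'user' / 'assistant'
    if getattr(tokenizer, 'chat_template', None):
        return tokenizer.apply_chat_template(messages, tokenize=False)
    # fallback: simple role-tagged concatenation
    return '\n\n'.join(f"[{m['role']}]\n{m['content']}" for m in messages)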
Training
tree-constructor for RL batch
# tree-constructor for RL batch
total_batch_size = prompt_only_batch['input_ids'].size(0) if not self.args.post \
else len(prompt_only_batch['prompts_list'])
micro_batch_size = self.args.per_device_train_batch_size
micro_batches = []
assert total_batch_size == micro_batch_size
micro_batches.extend(self.tree_constructor(prompt_only_batch))
calling the constructor
rl_batches = self.tree_constructor(prompt_only_batch)
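each of those batches then gets fed into the training step below; the dict keys line up with tsrl_step’s arguments, so the wiring is roughly (a sketch, not the full trainer loop):
for rl_batch in rl_batches:
    train_info = self.tsrl_step(**rl_batch)  # returns the train/* metrics dict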
the training step from dpo-mcts (dpo_mcts_trainer.py)
def tsrl_step(
self,
prompts_list: list[torch.Tensor],
input_ids_list: list[torch.Tensor],
attention_mask_list: list[torch.Tensor],
prediction: tuple = None,
init_value_list: list[float] = None,
max_n_sample: int = 8,
cur_max_new_tokens: int = 32,
) -> dict[str, Any]:
losses, better_sample_rewards, worse_sample_rewards, max_lengths = [], [], [], []
n_sample = len(input_ids_list)
start = prompts_list[0].size(-1) - 1
better_idx = -1
worse_idx = 0 if self.args.choose_worst else -2
all_better_input_ids, all_worse_input_ids = [], []
all_better_attention_mask, all_worse_attention_mask = [], []
all_init_value_list = []
for sample_id in range(n_sample):
if len(all_better_input_ids) >= max_n_sample: break
input_ids = input_ids_list[sample_id]
attention_mask = attention_mask_list[sample_id]
n_output = input_ids.size(0)
if n_output < 2: continue
if self.args.choose_random:
worse_idx = random.choice(range(n_output - 1))
all_better_input_ids.append(input_ids[better_idx])
all_worse_input_ids.append(input_ids[worse_idx])
all_better_attention_mask.append(attention_mask[better_idx])
all_worse_attention_mask.append(attention_mask[worse_idx])
all_init_value_list.extend([init_value_list[sample_id][better_idx], init_value_list[sample_id][worse_idx]])
all_input_ids = pad_tensors(all_better_input_ids + all_worse_input_ids, pad_value=self.tokenizer.pad_token_id)
all_attention_mask = pad_tensors(all_better_attention_mask + all_worse_attention_mask, pad_value=False)
torch.cuda.empty_cache()
all_sequence_log_probs = self.compute_log_probs(
self.actor_model.module,
input_ids=all_input_ids,
attention_mask=all_attention_mask,
)
all_better_input_ids, all_worse_input_ids = all_input_ids.chunk(chunks=2, dim=0)
all_better_attention_mask, all_worse_attention_mask = all_attention_mask.chunk(chunks=2, dim=0)
all_better_sequence_log_probs, all_worse_sequence_log_probs = all_sequence_log_probs.chunk(chunks=2, dim=0)
label_smoothing_values = []
for sample_id in range(len(all_better_input_ids)):
better_input_ids = all_better_input_ids[sample_id]
better_attention_mask = all_better_attention_mask[sample_id]
worse_input_ids = all_worse_input_ids[sample_id]
worse_attention_mask = all_worse_attention_mask[sample_id]
init_values = [all_init_value_list[sample_id * 2], all_init_value_list[sample_id * 2 + 1]]
better_sequence_log_probs, worse_sequence_log_probs = all_better_sequence_log_probs[sample_id], all_worse_sequence_log_probs[sample_id]
with torch.no_grad():
torch.cuda.empty_cache()
ref_better_sequence_log_probs = self.compute_log_probs(
self.actor_reference_model.module,
input_ids=better_input_ids.unsqueeze(0),
attention_mask=better_attention_mask.unsqueeze(0),
)[0]
torch.cuda.empty_cache()
ref_worse_sequence_log_probs = self.compute_log_probs(
self.actor_reference_model.module,
input_ids=worse_input_ids.unsqueeze(0),
attention_mask=worse_attention_mask.unsqueeze(0),
)[0]
better_end_index = better_attention_mask.nonzero()[-1]
worse_end_index = worse_attention_mask.nonzero()[-1]
try:
diverge_index = (better_input_ids != worse_input_ids).nonzero()[0]
assert 0 <= diverge_index <= better_end_index, 'diverge index is out of range!'
assert 0 <= diverge_index <= worse_end_index, 'diverge index is out of range!'
except:
continue
better_seq_slice = slice(diverge_index - 1, better_end_index)
worse_seq_slice = slice(diverge_index - 1, worse_end_index)
better_log_probs = better_sequence_log_probs[better_seq_slice].sum(dim=-1)
worse_log_probs = worse_sequence_log_probs[worse_seq_slice].sum(dim=-1)
ref_better_log_probs = ref_better_sequence_log_probs[better_seq_slice].sum(dim=-1)
ref_worse_log_probs = ref_worse_sequence_log_probs[worse_seq_slice].sum(dim=-1)
better_log_ratio = better_log_probs - ref_better_log_probs
worse_log_ratio = worse_log_probs - ref_worse_log_probs
if self.args.norm_prob or self.args.ipo:
better_log_ratio /= better_attention_mask[better_seq_slice].sum(dim=-1) ** self.args.length_penalty
worse_log_ratio /= worse_attention_mask[worse_seq_slice].sum(dim=-1) ** self.args.length_penalty
logits = better_log_ratio - worse_log_ratio
if self.args.ipo:
losses.append((logits - 1 / (2 * self.scale_coeff)) ** 2)
elif self.args.conservative:
qb, qw = init_values
confidence = calculate_preference_confidence(qb, qw)
label_smoothing = min(1 - confidence, 0.5)
losses.append(
- F.logsigmoid(self.scale_coeff * logits) * (1 - label_smoothing)
- F.logsigmoid(-self.scale_coeff * logits) * label_smoothing
)
label_smoothing_values.append(label_smoothing)
else:
losses.append(-F.logsigmoid(self.scale_coeff * logits))
better_sample_rewards.append(self.scale_coeff * better_log_ratio.detach())
worse_sample_rewards.append(self.scale_coeff * worse_log_ratio.detach())
max_lengths.append(better_attention_mask[start:].float().sum())
max_lengths.append(worse_attention_mask[start:].float().sum())
if not len(losses): return {}
loss = torch.stack(losses).mean()
max_generated_length = torch.stack(max_lengths).max()
total_max_generated_length = max_generated_length + start
better_sample_rewards = torch.stack(better_sample_rewards) # size = (B,)
worse_sample_rewards = torch.stack(worse_sample_rewards) # size = (B,)
rewards_accuracy = (
(better_sample_rewards > worse_sample_rewards).float().mean()
) # size = ()
better_sample_rewards = better_sample_rewards.mean() # size = ()
worse_sample_rewards = worse_sample_rewards.mean() # size = ()
rewards = better_sample_rewards + worse_sample_rewards # size = ()
rewards_margin = better_sample_rewards - worse_sample_rewards # size = ()
torch.cuda.empty_cache()
self.actor_model.backward(loss)
self.actor_model.step()
loss = get_all_reduce_mean(loss)
rewards = get_all_reduce_mean(rewards)
better_sample_rewards = get_all_reduce_mean(better_sample_rewards)
worse_sample_rewards = get_all_reduce_mean(worse_sample_rewards)
rewards_accuracy = get_all_reduce_mean(rewards_accuracy)
rewards_margin = get_all_reduce_mean(rewards_margin)
max_generated_length = get_all_reduce_max(max_generated_length)
total_max_generated_length = get_all_reduce_max(total_max_generated_length)
return {
'train/loss': loss.item(),
'train/rewards': rewards.item(),
'train/better_sample_rewards': better_sample_rewards.item(),
'train/worse_sample_rewards': worse_sample_rewards.item(),
'train/rewards_accuracy': rewards_accuracy.item(),
'train/rewards_margin': rewards_margin.item(),
'train/lr': self.actor_model.optimizer.param_groups[0]['lr'],
'train/r_scores': float(prediction[0]),
'train/correct': float(prediction[1]),
'train/n_sample': n_sample,
'train/max_generated_length': max_generated_length.item(),
'train/total_max_generated_length': total_max_generated_length.item(),
'train/label_smoothing': sum(label_smoothing_values) / len(label_smoothing_values) if len(label_smoothing_values) else 0,
'train/cur_max_new_tokens': cur_max_new_tokens,
}
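pad_tensors and calculate_preference_confidence are used above but not shown: the first right-pads the better/worse sequences to a common length before stacking, and the second turns the two Q values into a confidence score that drives the label smoothing in the conservative branch. A rough sketch of the padding helper so the snippet reads standalone (my paraphrase, not the actual implementation):
def pad_tensors(tensors: list[torch.Tensor], pad_value) -> torch.Tensor:
    # right-pad each 1-D tensor to the longest length, then stack into (B, T)
    max_len = max(t.size(-1) for t in tensors)
    padded = []
    for t in tensors:
        pad = torch.full((max_len - t.size(-1),), pad_value, dtype=t.dtype)
        padded.append(torch.cat([t, pad]))
    return torch.stack(padded)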
sample data using a base prompt
at the end of the mcts_searcher function we end up with data that looks sort of like this:
a system prompt plus multi-turn messages, where we change the ‘role’ from ‘player1’ to ‘user’ and from ‘player2’ to ‘assistant’
[{'role': 'system',
'content': 'You are an e...'},
{'role': 'player2',
'content': '\n**Opening Argument:**\n\nLadies and Gentlemen of the Court,\n\nToday,...'},
{'role': 'player1',
'content': 'counsel may argue that the "fundamental miscarriage of justice" e....'},
{'role': 'player2',
'content': '**Closing Argument:**\n\nLadies and Gentlemen of the Court,\n\n...'},
{'role': 'player1',
'content': '**Closing Argument:**\n\nLadies and Gentlemen of the ...'}]
update data w/ necessary changes for training
[{'role': 'system',
'content': 'You are an experienced prosecutor participating in a legal debate against an opposing attorney. You will be provided with case details, which include:\n\n- Title: McQuiggin v. Perkins \nCourt: United States Supreme Court \nCase Identifier: 133 S. Ct. 1924 (2013) \nRule of Law: Habeas corpus petitions alleging actual innocence may be exempt from the one-year statute of limitations.\n\nFacts: \nPerkins (defendant) attended a party with Henderson and Jones. Henderson was later found murdered. Perkins was convicted of the murder and sentenced to life without parole on October 27, 1993. Appeals were denied, finalizing the conviction on May 5, 1997. In July 2002, Perkins obtained affidavits supporting his innocence. He filed a habeas corpus petition on June 13, 2008, which the trial court deemed time-barred. The Sixth Circuit reversed, stating actual innocence claims are exempt from the one-year limit. The state petitioned for Supreme Court review.\n\nIssue: \nCan habeas corpus petitions alleging actual innocence be exempt from the one-year statute of limitations?\n\nHolding and Reasoning by Ginsburg, J.:\nYes. The AEDPA\'s one-year statute of limitations can be equitably tolled in exceptional cases where actual innocence is convincingly shown. The AEDPA does not clearly override traditional equitable authority. The "fundamental miscarriage of justice" exception applies, allowing habeas petitions to proceed even after the limit. The trial court should assess if Perkins meets this demanding standard and consider delays in presenting the claim.\n\nDissent by Scalia, J.:\nThe fundamental miscarriage of justice exception should be limited to judicially-created procedural bars. The AEDPA provides a strict statutory bar, and courts lack the authority to create exceptions.\n\n**Your Role:**\n\n- Given the issue at hand, argue in favor of ruling.\n\n**Instructions:**\n\n1. **Opening Argument**: Present a clear and persuasive opening statement supporting the prosecution\'s position.\n2. **Legal Analysis**: Use relevant facts, statutes, and precedents to strengthen your argument.\n3. **Anticipate Counterarguments**: Identify potential arguments the opposing attorney might raise and preemptively address them.\n4. **Rebuttals**: After the opposing attorney presents their arguments, provide counterpoints to refute their claims.\n5. **Closing Argument**: Summarize the key points that support the prosecution\'s case and explain why the court should rule in your favor.\n6. **Professionalism**: Maintain formal language and adhere to legal ethics throughout the debate.\n\n**Guidelines:**\n\n- Structure your arguments logically with clear headings.\n- Focus on legal reasoning and evidence.\n- Do not introduce facts outside the provided case details.\n- Aim to outmaneuver the opposing attorney by highlighting weaknesses in their arguments.\n\n**Note:** Use the specific case information provided above when preparing your arguments.'},
{'role': 'assistant',
'content': '\n**Opening Argument:**\n\nLadies and Gentlemen of the Court,\n\nToday, we stand before you to argue against the notion that habeas corpus petitions alleging actual innocence can be exempt from the one-year statute of limitations. The Antiterrorism and Effective Death Penalty Act (AEDPA) of 1996 clearly establishes a strict one-year statute of limitations for filing habeas corpus petitions. This statute is designed to ensure that claims are brought forth in a timely manner, preventing stale evidence and allowing for the efficient administration of justice.\n\nIn the case of McQuiggin v. Perkins, the defendant, Perkins, was convicted of murder and sentenced to life without parole in 1993. His appeals were denied in 1997. It is only in 2002 that he obtained affidavits supporting his innocence and filed a habeas corpus petition in 2008, more than a decade after his conviction. This significant delay raises concerns about the reliability of the evidence and the potential for other exculpatory evidence to have been lost or destroyed.\n\n**Legal Analysis:**\n\nThe AEDPA\'s one-year statute of limitations is a clear and unambiguous provision that does not allow for equitable tolling in the absence of extraordinary circumstances. The AEDPA was enacted to address concerns about the abuse of the habeas corpus process, which had become a tool for delaying executions and undermining the finality of criminal convictions. The one-year statute of limitations is a key component of the AEDPA, designed to prevent such abuses.\n\nThe majority opinion in McQuiggin v. Perkins relies on the "fundamental miscarriage of justice" exception to the one-year statute of limitations. However, this exception is not applicable in this case. The exception is intended to address situations where a constitutional violation has resulted in a fundamental miscarriage of justice, and the petitioner can demonstrate that the delay in filing the petition was due to extraordinary circumstances beyond their control. In this case, Perkins\' delay in filing his petition was not due to extraordinary circumstances but rather his own inaction.\n\n**Anticipate Counterarguments:**\n\nThe'},
{'role': 'user',
'content': 'counsel may argue that the "fundamental miscarriage of justice" exception should be expanded to include cases of actual innocence. However, such an expansion would undermine the purpose of the AEDPA and create uncertainty in the application of the statute of limitations. The AEDPA was designed to provide finality to criminal convictions, and expanding the exception to include actual innocence would make it more difficult for the government to enforce criminal convictions and protect the public.\n**Rebuttals:**\nIf the opposing counsel argues that the delay in filing the petition was due to Perkins\' inability to obtain the necessary evidence, we would counter that Perkins had ample time to pursue this evidence after his conviction. The fact that he did not do so until more than a decade later demonstrates a lack of diligence on his part. Furthermore, the delay in filing the petition raises concerns about the reliability of the evidence and the potential for other exculpatory evidence to have been lost or destroyed.\n**Closing Argument:**\nIn conclusion, the AEDPA\'s one-year statute of limitations is a clear and unambiguous provision that does not allow for equitable tolling in the absence of extraordinary circumstances. The "fundamental miscarriage of justice" exception is not applicable in this case, as Perkins\' delay in filing his petition was not due to extraordinary circumstances but rather his own inaction. Expanding the exception to include actual innocence would undermine the purpose of the AEDPA and create uncertainty in the application of the statute of limitations. We respectfully request that the Court uphold the AEDPA\'s one-year statute of limitations and deny Perkins\' habeas corpus petition.\n**Professionalism:**\n'},
...
]
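the transformation between the two is basically just a role rename (plus un-truncating the content) before serializing; a tiny helper for that step might look like this (names are mine, for illustration):
ROLE_MAP = {'player1': 'user', 'player2': 'assistant'}

def to_chat_roles(messages: list[dict]) -> list[dict]:
    # rename the debate roles to standard chat roles; 'system' passes through unchanged
    return [
        {'role': ROLE_MAP.get(m['role'], m['role']), 'content': m['content']}
        for m in messages
    ]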
There’s more to this, but I’ll leave it at that for now.