LLM-LaTeX Expand Macro (MaxCompute)

The LLM-LaTeX Expand Macro component preprocesses TeX-formatted text for large language models (LLMs). Its primary function is to expand parameterless macros whose names consist only of letters and numbers by replacing the name with its value.

Supported computing resources

MaxCompute

How it works

The LLM-LaTeX Expand Macro component expands parameterless macros inline using the following regular expressions:

Parameterless \newcommand macro

Regular expression: r'\\\bnewcommand\b\*?\{(\\[a-zA-Z0-9]+?)\}\{(.*?)\}$'
Content matched by the regular expression: \newcommand{\macro_name}{macro_value} and \newcommand*{\macro_name}{macro_value}

Parameterless \def macro

Regular expression: r'\\def\s*(\\[a-zA-Z0-9]+?)\s*\{(.*?)\}$'
Content matched by the regular expression: \def\macro_name{macro_value}

Description (both macro types): The macro_name can contain only letters and numbers. The macro_value can contain any characters.
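To make the matching rules concrete, the following minimal sketch applies the two regular expressions to a few sample definitions. It is an illustration only, not the component's actual code: the use of re.MULTILINE (so that `$` matches at the end of each line) and the sample macros are assumptions.

```python
import re

# Patterns copied from this section. re.MULTILINE is an assumption: it lets
# `$` match at the end of every line instead of only at the end of the text.
NEWCOMMAND_RE = re.compile(
    r'\\\bnewcommand\b\*?\{(\\[a-zA-Z0-9]+?)\}\{(.*?)\}$', re.MULTILINE)
DEF_RE = re.compile(
    r'\\def\s*(\\[a-zA-Z0-9]+?)\s*\{(.*?)\}$', re.MULTILINE)

# Hypothetical sample definitions, one per line.
sample = "\n".join([
    r"\newcommand{\R}{\mathbb{R}}",              # matched
    r"\newcommand*{\eps}{\varepsilon}",          # matched (starred form)
    r"\def\half{\frac{1}{2}}",                   # matched
    r"\newcommand{\norm}[1]{\lVert #1 \rVert}",  # takes an argument: not matched
    r"\newcommand{\my_macro}{x}",                # underscore in the name: not matched
])

# Each match yields a (macro_name, macro_value) pair.
found = NEWCOMMAND_RE.findall(sample) + DEF_RE.findall(sample)
for name, value in found:
    print(name, "->", value)
```

Only the three parameterless definitions whose names consist of letters and digits are extracted. The macro that takes an argument and the macro with an underscore in its name are left alone.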

The component extracts all strings that match the preceding regular expressions and replaces macro_name with macro_value. The following is an example.

Before processing

\usepackage{microtype}
\usepackage{graphicx}

% Attempt to make hyperref and algorithmic work together better:
\newcommand{\theHalgorithm}{\arabic{algorithm}}

% For theorems and such
\usepackage{amsmath}

After processing

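The component replaces every occurrence of \theHalgorithm with \arabic{algorithm}, including the occurrence inside the definition itself, so the definition line becomes \newcommand{\arabic{algorithm}}{\arabic{algorithm}}. The complete field value is as follows: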

\usepackage{microtype}
\usepackage{graphicx}

% Attempt to make hyperref and algorithmic work together better:
\newcommand{\arabic{algorithm}}{\arabic{algorithm}}
% For theorems and such
\usepackage{amsmath}
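The expansion shown in this example can be approximated with the short sketch below. It is a hedged illustration rather than the component's implementation: the function name expand_macros, the use of re.MULTILINE, and plain substring replacement are all assumptions chosen to mirror the documented result.

```python
import re

# Patterns from the "How it works" section; re.MULTILINE is an assumption.
MACRO_PATTERNS = [
    re.compile(r'\\\bnewcommand\b\*?\{(\\[a-zA-Z0-9]+?)\}\{(.*?)\}$', re.MULTILINE),
    re.compile(r'\\def\s*(\\[a-zA-Z0-9]+?)\s*\{(.*?)\}$', re.MULTILINE),
]


def expand_macros(text: str) -> str:
    """Replace each parameterless macro name with its value (hypothetical helper)."""
    # Collect all (name, value) pairs before replacing anything, so that
    # earlier replacements cannot change which definitions are found.
    macros = [pair for pattern in MACRO_PATTERNS for pair in pattern.findall(text)]
    for name, value in macros:
        # Plain substring replacement: the name inside the definition line
        # is replaced as well, as in the documented output.
        text = text.replace(name, value)
    return text


before = "\n".join([
    r"\usepackage{graphicx}",
    r"\newcommand{\theHalgorithm}{\arabic{algorithm}}",
    r"\usepackage{amsmath}",
])
after = expand_macros(before)
print(after)
```

Because this sketch uses plain substring replacement, a short macro name such as \R would also be replaced where it is only the prefix of a longer command. Whether the component guards against that case is not stated here.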

Configure the component

In Designer, add the LLM-LaTeX Expand Macro component to your pipeline. Then, configure the parameters in the pane on the right.

Parameter group: Field settings

Select columns to process: Select one or more columns to process.

Output table lifecycle: The lifecycle of the temporary table, in days. The table is deleted after this period. The value must be a positive integer. Default: 28.

Parameter group: Tuning

Number of CPUs per instance: The CPU resources for each map task instance. A value of 100 represents one vCPU. Range: 50–800. Default: 100.

Memory size per instance (MB): The memory for each map task instance, in MB. Range: 256–12288. Default: 1024.

Data size per instance (MB): The maximum amount of data that each map task instance can process, in MB. This parameter controls the input data size for map tasks. Range: 1 to Integer.MAX_VALUE. Default: 256.
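As a rough aid for choosing tuning values, the sketch below checks the documented ranges and converts the CPU setting to vCPUs. The TuningConfig class and estimate_instances function are hypothetical and not part of the component. The instance estimate assumes that the input is split evenly by the data size per instance, which this page does not state explicitly.

```python
import math
from dataclasses import dataclass

INT_MAX = 2_147_483_647  # value of Java's Integer.MAX_VALUE


@dataclass
class TuningConfig:
    """Hypothetical container for the Tuning parameters, using the documented defaults."""
    cpu_per_instance: int = 100        # 100 represents one vCPU
    memory_mb_per_instance: int = 1024
    data_mb_per_instance: int = 256

    def validate(self) -> None:
        # Ranges taken from the parameter descriptions.
        if not 50 <= self.cpu_per_instance <= 800:
            raise ValueError("cpu_per_instance must be in 50..800")
        if not 256 <= self.memory_mb_per_instance <= 12288:
            raise ValueError("memory_mb_per_instance must be in 256..12288")
        if not 1 <= self.data_mb_per_instance <= INT_MAX:
            raise ValueError("data_mb_per_instance must be in 1..Integer.MAX_VALUE")

    @property
    def vcpus(self) -> float:
        return self.cpu_per_instance / 100


def estimate_instances(total_input_mb: float, cfg: TuningConfig) -> int:
    # Assumption: each map task instance receives at most data_mb_per_instance MB,
    # so the instance count is the input size divided by that limit, rounded up.
    return max(1, math.ceil(total_input_mb / cfg.data_mb_per_instance))


cfg = TuningConfig()
cfg.validate()
print(cfg.vcpus, estimate_instances(10_000, cfg))
```

Under these assumptions, lowering the data size per instance increases the number of map task instances for the same input, while the CPU and memory settings apply to each instance individually.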