The LLM-Remove LaTeX Comment Lines (DLC) component removes both full-line and inline comments from LaTeX-formatted text. The input data file in OSS must be in JSONL format, in which each line is a self-contained JSON object.
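As a rough illustration of the JSONL layout described above, the following Python sketch writes two records as one JSON object per line and parses them back. The field name `text` and the sample values are assumptions made for illustration; they are not taken from the component's documentation.

```python
import json

# Hypothetical records; the field name "text" is an assumption.
rows = [
    {"text": "\\section{Intro} % a comment"},
    {"text": "% full-line comment\nE = mc^2"},
]

# json.dumps escapes embedded newlines, so every record stays on a single line.
jsonl_data = "\n".join(json.dumps(row) for row in rows) + "\n"

# Each non-empty line can be parsed on its own, independent of the others.
records = [json.loads(line) for line in jsonl_data.splitlines() if line.strip()]
```

Because every line parses independently, a file in this format can be split across workers without any shared parsing state.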
Supported computing resources
Algorithm
The component uses the following regular expressions to identify comments in LaTeX-formatted text:
| Comment type | Regular expression |
|---|---|
| Full-line comments | |
| Inline comments | |
The component finds all strings that match these regular expressions and replaces them with an empty string. The following example shows how this works.
| Before processing | After processing |
|---|---|
| | In the Current Field Value dialog box, the field contains the cleaned LaTeX source code, including the following lines: |
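The match-and-replace step described in this section can be sketched in Python as follows. The two patterns are assumptions chosen for illustration and are not the component's actual regular expressions: the first treats any line whose first non-blank character is `%` as a full-line comment, and the second treats an unescaped `%` and everything after it on the same line as an inline comment. The function name `remove_latex_comments` is likewise hypothetical.

```python
import re

# Assumed pattern: optional leading blanks, then "%", through the end of the line
# (including its newline, so no empty line is left behind).
FULL_LINE_COMMENT = re.compile(r"^[ \t]*%.*\n?", re.MULTILINE)

# Assumed pattern: a "%" not preceded by a backslash, through the end of the line.
# Simplification: a "%" that follows an escaped backslash ("\\%") is not detected.
INLINE_COMMENT = re.compile(r"(?<!\\)%.*")


def remove_latex_comments(text, full_line=True, inline=True):
    """Replace every match of the enabled patterns with an empty string."""
    if full_line:
        text = FULL_LINE_COMMENT.sub("", text)
    if inline:
        text = INLINE_COMMENT.sub("", text)
    return text


sample = "% preamble note\n\\section{Intro} % todo\n50\\% of cases\n"
cleaned = remove_latex_comments(sample)
```

In this sketch the escaped percent sign in `50\%` is left untouched, and whitespace that preceded an inline comment is kept, so the second line ends with a trailing space.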
Configure the component
On the Designer workflow page, add the LLM-Remove LaTeX Comment Lines (DLC) component and configure its parameters in the right-side pane.
| Parameter type | Parameter | Required | Description | Default |
|---|---|---|---|---|
| Field settings | Field to process | Yes | The field to process. | None |
| | Remove all full-line comments | No | Specifies whether to remove all full-line comments. | Selected |
| | Remove all inline comments | No | Specifies whether to remove all inline comments. | Selected |
| | OSS output directory | No | The OSS directory for the output data. If empty, the component uses the default workspace path. | None |
| Execution tuning | Number of processes | No | The number of concurrent processes for the job. | 8 |
| Select resource group | Public resource group | No | Select a node specification (CPU or GPU instance specification), the number of nodes, and a VPC. | None |
| | Dedicated resource group | No | Select the number of CPU cores, memory, shared memory, number of GPUs, and number of nodes. | None |
| | Maximum runtime | No | The component's maximum runtime. The job is terminated if it exceeds this limit. | None |
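To make the field settings more concrete, the sketch below shows one way the Field to process, Remove all full-line comments, and Remove all inline comments options might act on a single JSONL line. It is only an approximation under stated assumptions: the function `clean_jsonl_line`, its argument names, the field name `text`, and both regular expressions are hypothetical and do not come from the component itself.

```python
import json
import re

# Assumed patterns, not the component's actual expressions.
FULL_LINE = re.compile(r"^[ \t]*%.*\n?", re.MULTILINE)
INLINE = re.compile(r"(?<!\\)%.*")


def clean_jsonl_line(line, field, remove_full_line=True, remove_inline=True):
    """Clean one JSONL line; only the chosen field is modified."""
    record = json.loads(line)
    text = record[field]
    if remove_full_line:
        text = FULL_LINE.sub("", text)
    if remove_inline:
        text = INLINE.sub("", text)
    record[field] = text
    return json.dumps(record)


# Hypothetical input record with one full-line and one inline comment.
line = json.dumps({"id": 1, "text": "% note\nx = 1 % inline\n"})

only_full = json.loads(clean_jsonl_line(line, "text", remove_inline=False))
both = json.loads(clean_jsonl_line(line, "text"))
```

Both options default to enabled here, mirroring the Selected defaults in the table; other fields in the record, such as `id`, pass through unchanged.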