1. What does this tool do
This free online text splitter splits text by punctuation, word, phrase, or custom regex pattern. It returns an ordered list of chunks that you can copy as newline- or comma-separated text. Options let you trim each chunk and omit empty segments. Use it to break text into parts for data preparation, parsing, or further processing. No sign-up or upload is required; all processing runs in your browser.
2. How to use it
Quick start: Choose a mode (Punctuation, Word, Phrase, or Pattern), enter a delimiter or pattern if needed, paste your text, then click Split. Copy the chunks in your preferred format.
- Select mode: Punctuation, Word, Phrase, or Pattern.
- Punctuation mode: Choose All punctuation (any Unicode punctuation) or Selected punctuation and check the characters to split on (e.g. period, comma, semicolon).
- Phrase mode: Enter the exact delimiter string (e.g. " and " or "---"). Choose whether to Remove the delimiter, Keep with next chunk, or Keep with previous chunk. An empty delimiter is not allowed.
- Pattern mode: Enter one or more JavaScript regex patterns and optional flags (g, i, m). Multiple patterns are applied in order: the first pattern splits the text, the second splits each of those chunks, and so on. The result is one flat list of chunks, to which the trim and omit-empty options are then applied. Example: split by double newline (paragraphs), then by sentence-ending punctuation to get a list of sentences. The same validation and length limits as the Regex Cleaner apply.
- Options: Check Trim each chunk and Omit empty chunks to clean the result (both are on by default).
- Enter or paste text: Type or paste into the input area.
- Click Split: The tool splits the text and shows the chunk count and list.
- Copy or Export: Copy as Newline or Comma, or click Export CSV to download a CSV file (optionally with an index column).
3. How it works
- Punctuation: All splits on runs of one or more Unicode punctuation characters (\p{P}). Selected splits only on the punctuation characters you check (e.g. ".", ",", "!"). Chunks are the segments between those runs.
- Word: Splits on whitespace (same logic as the Text Tokenizer's Words mode). Punctuation attached to a word is not stripped and stays part of that chunk; trim/omit empty still apply.
- Phrase: Splits on the literal delimiter string you provide. No regex; the delimiter is used as-is. You can Remove the delimiter from output, Keep with next chunk (each chunk after the first starts with the delimiter), or Keep with previous chunk (each chunk except the last ends with the delimiter). Trim applies to the whole chunk.
- Pattern: Builds a regex from your pattern and flags, then uses String.prototype.split(regex). You can add multiple patterns; they are applied in sequence (first pattern splits the input, second splits each resulting chunk, etc.), producing one flat list of chunks before trim/omit empty. Pattern and input length limits match the Regex Cleaner to reduce the risk of ReDoS and abuse.
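The sequential behaviour of Pattern mode can be sketched roughly as follows. This is an illustration only, not the tool's source: the function name `splitByPatterns`, the `{ source, flags }` shape, and the option names are assumptions made for the example.

```javascript
// Hypothetical helper: apply several regex patterns one after another.
// Each pattern is { source, flags }; these names are illustrative only.
function splitByPatterns(text, patterns, { trim = true, omitEmpty = true } = {}) {
  let chunks = [text];
  for (const { source, flags = "" } of patterns) {
    const regex = new RegExp(source, flags);
    // Split every current chunk and flatten the results into one list.
    // split() can yield undefined for capture groups that did not match,
    // so those entries are dropped.
    chunks = chunks.flatMap((chunk) =>
      chunk.split(regex).filter((part) => part !== undefined)
    );
  }
  // Trim and omit-empty run once, after all patterns have been applied.
  if (trim) chunks = chunks.map((chunk) => chunk.trim());
  if (omitEmpty) chunks = chunks.filter((chunk) => chunk.length > 0);
  return chunks;
}

// Paragraphs first (double newline), then sentence-ending punctuation.
const sentences = splitByPatterns(
  "First one. Second one!\n\nNew paragraph? Yes.",
  [{ source: "\\n\\n" }, { source: "[.!?]+" }]
);
console.log(sentences);
// ["First one", "Second one", "New paragraph", "Yes"]
```

Note that `String.prototype.split` includes the text of any capturing groups in its result, so a pattern such as `([.!?])` would add the matched punctuation as separate chunks in this sketch.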
Export CSV: After splitting, you can download the chunks as a CSV file with one row per chunk. Optionally include a 1-based index column. Fields are escaped per RFC 4180 (commas, newlines, and double quotes).
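A minimal sketch of RFC 4180-style escaping for the export is shown here. The helper names `escapeCsvField` and `chunksToCsv` and the `includeIndex` option are hypothetical, and the sketch assumes no header row, which the description above does not specify.

```javascript
// Hypothetical helpers; names and options are illustrative only.
function escapeCsvField(value) {
  const text = String(value);
  // Quote the field only if it contains a comma, double quote, CR, or LF.
  if (/[",\r\n]/.test(text)) {
    // Inside a quoted field, each double quote is doubled.
    return '"' + text.replace(/"/g, '""') + '"';
  }
  return text;
}

function chunksToCsv(chunks, { includeIndex = false } = {}) {
  const rows = chunks.map((chunk, i) => {
    // The index column is 1-based.
    const fields = includeIndex ? [i + 1, chunk] : [chunk];
    return fields.map(escapeCsvField).join(",");
  });
  // RFC 4180 separates records with CRLF.
  return rows.join("\r\n");
}

console.log(chunksToCsv(["plain", "has, comma", 'say "hi"'], { includeIndex: true }));
// 1,plain
// 2,"has, comma"
// 3,"say ""hi"""
```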
All computation runs in your browser. No data is sent to any server.
4. Use cases & examples
- Data preparation: Split CSV-like or log lines by a delimiter (phrase or pattern).
- Sentence or clause extraction: Split by punctuation to get rough segments.
- Word lists: Use Word mode to get a list of words (with trim/omit empty).
- Custom parsing: Use Pattern mode with a regex when you need flexible split rules.
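For the word-list use case, Word mode can be approximated with a whitespace split. The function name `splitByWhitespace` is hypothetical, and the sketch assumes that "whitespace" means the JavaScript `\s` class.

```javascript
// Hypothetical helper approximating Word mode; not the tool's actual source.
function splitByWhitespace(text) {
  return text
    .split(/\s+/) // one or more whitespace characters
    .filter((chunk) => chunk.length > 0); // drop empties from leading/trailing whitespace
}

console.log(splitByWhitespace("  Hello, world!  How are you? "));
// ["Hello,", "world!", "How", "are", "you?"]
```

Punctuation stays attached to its word in the output, so a further cleanup step is needed if bare words are required.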
Example (Phrase mode, delimiter " and ", Remove):
Input: "apples and oranges and bananas"
Chunks: apples, oranges, bananas
Example (Phrase mode, Keep with next chunk):
Input: "a|b|c", delimiter |
Chunks: a, |b, |c
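The two Phrase mode examples above can be reproduced with a short sketch. The function name `splitByPhrase` and the mode strings "remove", "next", and "previous" are assumptions made for illustration; trim and omit-empty are left out to keep the delimiter handling visible.

```javascript
// Hypothetical helper: literal-delimiter split with three delimiter modes.
// mode is "remove", "next", or "previous" (illustrative names).
function splitByPhrase(text, delimiter, mode = "remove") {
  if (delimiter === "") {
    throw new Error("Delimiter must not be empty");
  }
  // A string argument to split() is matched literally, with no regex involved.
  const parts = text.split(delimiter);
  if (mode === "next") {
    // Every chunk after the first starts with the delimiter.
    return parts.map((part, i) => (i === 0 ? part : delimiter + part));
  }
  if (mode === "previous") {
    // Every chunk except the last ends with the delimiter.
    return parts.map((part, i) => (i === parts.length - 1 ? part : part + delimiter));
  }
  return parts;
}

console.log(splitByPhrase("apples and oranges and bananas", " and "));
// ["apples", "oranges", "bananas"]
console.log(splitByPhrase("a|b|c", "|", "next"));
// ["a", "|b", "|c"]
console.log(splitByPhrase("a|b|c", "|", "previous"));
// ["a|", "b|", "c"]
```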
Example (Punctuation mode, All):
Input: "Hello. World! How are you?"
Chunks (with trim/omit empty): Hello, World, How are you
Example (Punctuation mode, Selected: period and comma):
Input: "One. Two, three."
Chunks: One, Two, three
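Both Punctuation mode examples can be approximated with the sketch below. The function name `splitByPunctuation` and the `selected` parameter are hypothetical; the sketch always trims and omits empty chunks to match the outputs shown above.

```javascript
// Hypothetical helper. `selected` is an array of single punctuation characters;
// when it is omitted or empty, any Unicode punctuation (\p{P}) is a delimiter.
function splitByPunctuation(text, selected) {
  let regex;
  if (selected && selected.length > 0) {
    // Escape characters that are special inside a character class.
    const escaped = selected
      .map((ch) => ch.replace(/[\\\]\[^-]/g, "\\$&"))
      .join("");
    regex = new RegExp(`[${escaped}]+`, "u");
  } else {
    // The u flag is required for \p{...} property escapes.
    regex = /\p{P}+/u;
  }
  return text
    .split(regex)
    .map((chunk) => chunk.trim())
    .filter((chunk) => chunk.length > 0);
}

console.log(splitByPunctuation("Hello. World! How are you?"));
// ["Hello", "World", "How are you"]
console.log(splitByPunctuation("One. Two, three.", [".", ","]));
// ["One", "Two", "three"]
```

In this sketch an apostrophe counts as punctuation under `\p{P}`, so a word like "don't" would be split in two when all punctuation is selected.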
5. Limitations & known constraints
- Input cap: Maximum 100,000 characters. Larger input returns an error.
- Pattern mode: Pattern length limited to 500 characters; same validation as Regex Cleaner. JavaScript regex syntax only.
- Client-side only: No server; processing runs in the browser. Very large inputs may cause brief UI lag on slower devices.
- Phrase mode: Delimiter is case-sensitive and literal; no regex.
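The limits above could be checked before splitting along these lines. This is only a sketch: the function name `validateSplitRequest`, the constant names, the error messages, and the result shape are hypothetical, and it assumes "characters" are counted with the string's `length` (UTF-16 code units). The Regex Cleaner's additional validation rules are not reproduced here.

```javascript
// Limits taken from the list above; the names are illustrative only.
const MAX_INPUT_LENGTH = 100000;
const MAX_PATTERN_LENGTH = 500;

// Hypothetical pre-flight check; `pattern` is optional (Pattern mode only).
function validateSplitRequest(text, pattern) {
  if (text.length > MAX_INPUT_LENGTH) {
    return { ok: false, error: `Input exceeds ${MAX_INPUT_LENGTH} characters` };
  }
  if (pattern !== undefined) {
    if (pattern.length > MAX_PATTERN_LENGTH) {
      return { ok: false, error: `Pattern exceeds ${MAX_PATTERN_LENGTH} characters` };
    }
    try {
      // Constructing the RegExp throws a SyntaxError for invalid patterns.
      new RegExp(pattern);
    } catch (err) {
      return { ok: false, error: `Invalid pattern: ${err.message}` };
    }
  }
  return { ok: true };
}

console.log(validateSplitRequest("some text", "\\s+").ok); // true
console.log(validateSplitRequest("some text", "(").ok); // false
```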