Sentence Chunker
- class SentenceChunker
Splits input text into smaller chunks, which is useful when processing large documents. The chunker tries to keep sentences and paragraphs together.
- Parameters:
chunk_size (int, optional) – Size of each chunk. Default is 512.
chunk_overlap (int, optional) – Amount of overlap between chunks. Default is 256.
separator (str, optional) – Separator used for splitting text. Default is " " (a single space).
Example
from pineflow.core.text_chunkers import SentenceChunker

text_chunker = SentenceChunker()
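To illustrate how chunk_size and chunk_overlap interact, here is a minimal standalone sketch. It is not Pineflow's implementation: the helper name sketch_sentence_chunks is hypothetical, sizes are assumed to be counted in whitespace-separated words, and sentences are detected with a naive regular expression. A chunk can exceed chunk_size when a single sentence is longer than the limit.

```python
import re
from typing import List


def sketch_sentence_chunks(
    text: str, chunk_size: int = 512, chunk_overlap: int = 256
) -> List[str]:
    """Hypothetical helper: greedily pack whole sentences into chunks.

    Assumption: sizes are measured in whitespace-separated words.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    chunks: List[str] = []
    current: List[str] = []  # words of the chunk being built

    for sentence in sentences:
        words = sentence.split()
        # Close the current chunk if adding this sentence would overflow it.
        if current and len(current) + len(words) > chunk_size:
            chunks.append(" ".join(current))
            # Start the next chunk with the last `chunk_overlap` words.
            current = current[-chunk_overlap:] if chunk_overlap > 0 else []
        current.extend(words)

    if current:
        chunks.append(" ".join(current))
    return chunks


example = sketch_sentence_chunks(
    "One two three. Four five six. Seven eight nine.",
    chunk_size=6,
    chunk_overlap=3,
)
for chunk in example:
    print(chunk)
```

With chunk_size=6 and chunk_overlap=3, the second chunk begins with the last three words of the first, so text near a chunk boundary appears in both chunks.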
- from_documents(documents)
Split documents into chunks.
- from_text(text)
Split text into chunks.
- Parameters:
text (str) – Input text to split.
- Returns:
List of text chunks.
- Return type:
List[str]
Example
chunks = text_chunker.from_text(
    "Pineflow is a data framework to load any data in one line of code and connect with AI applications."
)