nlpmed_engine.api package

Submodules

nlpmed_engine.api.main module

nlpmed_engine.api.mappers module

Mapping functions between Pydantic models and internal data structures.

This module provides mapping functions that convert between the Pydantic models used in the API layer and the internal data structures used for processing within NLPMed-Engine. Each level of the data hierarchy (sentence, section, note, patient) has a pair of mappers, one per direction, so API inputs can be turned into internal objects and processing results can be turned back into API outputs.

Functions:
map_pydantic_to_internal_sentence:

Maps a Pydantic SentenceModel to an internal Sentence object.

map_internal_to_pydantic_sentence_model:

Maps an internal Sentence object to a Pydantic SentenceModel.

map_pydantic_to_internal_section:

Maps a Pydantic SectionModel to an internal Section object.

map_internal_to_pydantic_section_model:

Maps an internal Section object to a Pydantic SectionModel.

map_pydantic_to_internal_note:

Maps a Pydantic NoteModel to an internal Note object.

map_internal_to_pydantic_note_model:

Maps an internal Note object to a Pydantic NoteModel.

map_pydantic_to_internal_patient:

Maps a Pydantic PatientModel to an internal Patient object.

map_internal_to_pydantic_patient_model:

Maps an internal Patient object to a Pydantic PatientModel.
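The paired mappers listed above can be pictured with a minimal sketch for the sentence level. The SentenceModel fields follow the signature documented in the models module; the Sentence dataclass and the function names to_internal and to_pydantic are hypothetical stand-ins, since the internal class and the actual mapper bodies are not shown in this reference.

```python
from dataclasses import asdict, dataclass

from pydantic import BaseModel


class SentenceModel(BaseModel):
    # Fields mirror the documented SentenceModel signature.
    text: str
    start_index: int
    end_index: int
    is_duplicate: bool = False
    is_important: bool = False
    is_expanded: bool = False


@dataclass
class Sentence:
    # Hypothetical stand-in for the internal Sentence class (assumption).
    text: str
    start_index: int
    end_index: int
    is_duplicate: bool = False
    is_important: bool = False
    is_expanded: bool = False


def to_internal(sentence_model: SentenceModel) -> Sentence:
    # Copy every validated field onto the internal object.
    return Sentence(**sentence_model.model_dump())


def to_pydantic(sentence: Sentence) -> SentenceModel:
    # Re-validate the internal state on the way back to the API layer.
    return SentenceModel(**asdict(sentence))


model = SentenceModel(text="No acute distress.", start_index=0, end_index=18)
internal = to_internal(model)
round_tripped = to_pydantic(internal)
```

Under these assumptions a round trip preserves every field, which is the property the section, note, and patient mappers would also need to hold for their nested lists.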

nlpmed_engine.api.mappers.map_internal_to_pydantic_note_model(note: Note) NoteModel

Map an internal Note object to a Pydantic NoteModel.

Args:

note (Note): The internal Note object to be converted.

Returns:

NoteModel: The corresponding Pydantic model with mapped sections and attributes.

nlpmed_engine.api.mappers.map_internal_to_pydantic_patient_model(patient: Patient) PatientModel

Map an internal Patient object to a Pydantic PatientModel.

Args:

patient (Patient): The internal Patient object to be converted.

Returns:

PatientModel: The corresponding Pydantic model with mapped notes.

nlpmed_engine.api.mappers.map_internal_to_pydantic_section_model(section: Section) SectionModel

Map an internal Section object to a Pydantic SectionModel.

Args:

section (Section): The internal Section object to be converted.

Returns:

SectionModel: The corresponding Pydantic model with mapped sentences and attributes.

nlpmed_engine.api.mappers.map_internal_to_pydantic_sentence_model(sentence: Sentence) SentenceModel

Map an internal Sentence object to a Pydantic SentenceModel.

Args:

sentence (Sentence): The internal Sentence object to be converted.

Returns:

SentenceModel: The corresponding Pydantic model with mapped attributes.

nlpmed_engine.api.mappers.map_pydantic_to_internal_note(note_model: NoteModel) Note

Map a Pydantic NoteModel to an internal Note object.

Args:

note_model (NoteModel): The Pydantic model representing a note.

Returns:

Note: The internal Note object with mapped sections and attributes.

nlpmed_engine.api.mappers.map_pydantic_to_internal_patient(patient_model: PatientModel) Patient

Map a Pydantic PatientModel to an internal Patient object.

Args:

patient_model (PatientModel): The Pydantic model representing a patient.

Returns:

Patient: The internal Patient object with mapped notes.

nlpmed_engine.api.mappers.map_pydantic_to_internal_section(section_model: SectionModel) Section

Map a Pydantic SectionModel to an internal Section object.

Args:

section_model (SectionModel): The Pydantic model representing a section.

Returns:

Section: The internal Section object with mapped sentences and attributes.

nlpmed_engine.api.mappers.map_pydantic_to_internal_sentence(sentence_model: SentenceModel) Sentence

Map a Pydantic SentenceModel to an internal Sentence object.

Args:

sentence_model (SentenceModel): The Pydantic model representing a sentence.

Returns:

Sentence: The internal Sentence object with the corresponding attributes.

nlpmed_engine.api.models module

Pydantic models for NLPMed-Engine API.

This module defines the Pydantic models used for validating and managing the input, output, and configuration data for the NLPMed-Engine. The models provide a structured representation of various components involved in text processing, including sentences, sections, notes, patients, and different text processing components.

Classes:
StringInputModel:

Model for string inputs to be processed.

ComponentStatusModel:

Base model for component status.

EncodingFixerModel:

Model for encoding fixer component status.

PatternReplacerModel:

Model for pattern replacer component with pattern and target replacements.

WordMaskerModel:

Model for word masker component with mask settings.

NoteFilterModel:

Model for filtering notes based on keywords.

SectionSplitterModel:

Model for splitting sections using a delimiter.

SectionFilterModel:

Model for filtering sections based on include and exclude keywords.

SentenceSegmenterModel:

Model for sentence segmentation settings.

DuplicateCheckerModel:

Model for duplicate checking configuration.

SentenceFilterModel:

Model for filtering sentences based on keywords.

SentenceExpanderModel:

Model for expanding short sentences.

JoinerModel:

Model for joining sentences and sections.

MLInferenceModel:

Model for machine learning inference settings.

ConfigModel:

Configuration model for all text processing components.

SentenceModel:

Model for representing a sentence with various attributes.

SectionModel:

Model for representing a section with sentences.

NoteModel:

Model for representing a note with sections and preprocessed text.

PatientModel:

Model for representing a patient with associated notes.

TextProcessingResponseModel:

Model for responses related to text processing output.

class nlpmed_engine.api.models.ComponentStatusModel(*, status: str)

Bases: BaseModel

Base model for component status.

Attributes:

status (str): Status of the component (‘enabled’, ‘disabled’, ‘excluded’).

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

status: str
class nlpmed_engine.api.models.ConfigModel(*, encoding_fixer: dict | None = None, pattern_replacer: dict | None = None, word_masker: dict | None = None, note_filter: dict | None = None, section_splitter: dict | None = None, section_filter: dict | None = None, sentence_segmenter: dict | None = None, duplicate_checker: dict | None = None, sentence_filter: dict | None = None, sentence_expander: dict | None = None, joiner: dict | None = None, ml_inference: dict | None = None, debug: bool = False)

Bases: BaseModel

Configuration model for all text processing components.

Attributes:

encoding_fixer (dict | None): Configuration for the encoding fixer component.
pattern_replacer (dict | None): Configuration for the pattern replacer component.
word_masker (dict | None): Configuration for the word masker component.
note_filter (dict | None): Configuration for the note filter component.
section_splitter (dict | None): Configuration for the section splitter component.
section_filter (dict | None): Configuration for the section filter component.
sentence_segmenter (dict | None): Configuration for the sentence segmenter component.
duplicate_checker (dict | None): Configuration for the duplicate checker component.
sentence_filter (dict | None): Configuration for the sentence filter component.
sentence_expander (dict | None): Configuration for the sentence expander component.
joiner (dict | None): Configuration for the joiner component.
ml_inference (dict | None): Configuration for the machine learning inference component.
debug (bool): Whether debug mode is enabled. Defaults to False.

debug: bool
duplicate_checker: dict | None
encoding_fixer: dict | None
joiner: dict | None
ml_inference: dict | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

note_filter: dict | None
pattern_replacer: dict | None
section_filter: dict | None
section_splitter: dict | None
sentence_expander: dict | None
sentence_filter: dict | None
sentence_segmenter: dict | None
word_masker: dict | None
class nlpmed_engine.api.models.DuplicateCheckerModel(*, status: str, num_perm: int = 256, sim_threshold: float = 0.9, length_threshold: int = 50)

Bases: ComponentStatusModel

Model for duplicate checking configuration.

Attributes:

num_perm (int): Number of permutations for MinHash.
sim_threshold (float): Similarity threshold for duplicates.
length_threshold (int): Length threshold for checking duplicates.

length_threshold: int
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

num_perm: int
sim_threshold: float
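To make num_perm and sim_threshold concrete, here is a self-contained MinHash sketch using only the standard library. It is an illustration of the technique, not the engine's implementation: the whitespace tokenisation, the seeded SHA-1 hashing, and the function names are all assumptions, and length_threshold is not modelled.

```python
import hashlib


def minhash_signature(text: str, num_perm: int = 256) -> list[int]:
    """Return one minimum hash value per seeded hash function."""
    tokens = set(text.lower().split())
    if not tokens:
        raise ValueError("text must contain at least one token")
    signature = []
    for seed in range(num_perm):
        # Prefixing the seed stands in for applying a distinct permutation.
        signature.append(
            min(
                int.from_bytes(
                    hashlib.sha1(f"{seed}:{token}".encode()).digest()[:8], "big"
                )
                for token in tokens
            )
        )
    return signature


def estimated_similarity(a: str, b: str, num_perm: int = 256) -> float:
    # The fraction of matching signature slots estimates Jaccard similarity.
    sig_a = minhash_signature(a, num_perm)
    sig_b = minhash_signature(b, num_perm)
    return sum(x == y for x, y in zip(sig_a, sig_b)) / num_perm


def is_duplicate(
    a: str, b: str, num_perm: int = 256, sim_threshold: float = 0.9
) -> bool:
    return estimated_similarity(a, b, num_perm) >= sim_threshold
```

A larger num_perm gives a lower-variance similarity estimate at the cost of more hashing, while sim_threshold sets how close two sentences must be before one is flagged.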
class nlpmed_engine.api.models.EncodingFixerModel(*, status: str)

Bases: ComponentStatusModel

Model for encoding fixer component status.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class nlpmed_engine.api.models.JoinerModel(*, status: str, sentence_delimiter: str = '\n', section_delimiter: str = '\n\n')

Bases: ComponentStatusModel

Model for joining sentences and sections.

Attributes:

sentence_delimiter (str): Delimiter for joining sentences.
section_delimiter (str): Delimiter for joining sections.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

section_delimiter: str
sentence_delimiter: str
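The two delimiters suggest a two-level join: sentences within a section first, then sections within a note. A minimal sketch under that assumption follows; the function join_note and its nested-list input are hypothetical, and only the default delimiter values come from the signature above.

```python
def join_note(
    sections: list[list[str]],
    sentence_delimiter: str = "\n",
    section_delimiter: str = "\n\n",
) -> str:
    # Join sentences inside each section, then join the resulting sections.
    return section_delimiter.join(
        sentence_delimiter.join(sentences) for sentences in sections
    )


joined = join_note([["No fever.", "Mild cough."], ["Rest and fluids."]])
```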
class nlpmed_engine.api.models.MLInferenceModel(*, status: str, device: str = 'cpu', ml_model_path: str, ml_tokenizer_path: str)

Bases: ComponentStatusModel

Model for machine learning inference settings.

Attributes:

device (str): Device used for model inference (e.g., ‘cpu’, ‘cuda’).
ml_model_path (str): Path to the model.
ml_tokenizer_path (str): Path to the tokenizer.

device: str
ml_model_path: str
ml_tokenizer_path: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class nlpmed_engine.api.models.MlModelInfo(*, name: str, device: str, max_length: int, loaded: bool, loaded_at: str | None = None)

Bases: BaseModel

device: str
loaded: bool
loaded_at: str | None
max_length: int
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str
class nlpmed_engine.api.models.MlModelsResponse(*, default_name: str, models: list[MlModelInfo])

Bases: BaseModel

default_name: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

models: list[MlModelInfo]
class nlpmed_engine.api.models.NoteFilterModel(*, status: str, words_to_search: list[str] = <factory>)

Bases: ComponentStatusModel

Model for filtering notes based on keywords.

Attributes:

words_to_search (list[str]): Keywords to search in the notes.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class nlpmed_engine.api.models.NoteModel(*, text: str, sections: list[SectionModel] = <factory>, preprocessed_text: str | None = None, predicted_label: str | None = None, predicted_score: float | None = None, note_id: str | None = None)

Bases: BaseModel

Model for representing a note with sections and preprocessed text.

Attributes:

note_id (str | None): Unique identifier for the note.
text (str): Text of the note.
sections (list[SectionModel]): List of sections in the note.
preprocessed_text (str | None): Preprocessed text of the note.
predicted_label (str | None): Predicted label from the model inference.
predicted_score (float | None): Predicted score from the model inference.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

note_id: str | None
predicted_label: str | None
predicted_score: float | None
preprocessed_text: str | None
sections: list[SectionModel]
text: str
class nlpmed_engine.api.models.PatientModel(*, patient_id: str, notes: list[NoteModel])

Bases: BaseModel

Model for representing a patient with associated notes.

Attributes:

patient_id (str): Unique identifier for the patient.
notes (list[NoteModel]): List of notes associated with the patient.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

notes: list[NoteModel]
patient_id: str
class nlpmed_engine.api.models.PatternReplacerModel(*, status: str, pattern: str = '\\s{4,}', target: str = '\n\n')

Bases: ComponentStatusModel

Model for pattern replacer component with pattern and target replacements.

Attributes:

pattern (str): Regex pattern to replace in the text.
target (str): Target string to replace matched pattern.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

pattern: str
target: str
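With the default values, the replacer turns any run of four or more whitespace characters into a blank line. The sketch below shows that substitution with Python's re module; the function name replace_pattern is hypothetical, and applying the pattern with re.sub is an assumption about how the component uses these two fields.

```python
import re


def replace_pattern(text: str, pattern: str = r"\s{4,}", target: str = "\n\n") -> str:
    # Replace every match of the regex with the target string.
    return re.sub(pattern, target, text)


cleaned = replace_pattern("HPI: cough for 3 days.      PLAN: rest.")
```

Since the default target equals the default section delimiter, this step would plausibly prepare wide whitespace gaps for section splitting.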
class nlpmed_engine.api.models.SectionFilterModel(*, status: str, section_inc_list: list[str] = <factory>, section_exc_list: list[str] = <factory>, fallback: bool = False)

Bases: ComponentStatusModel

Model for filtering sections based on include and exclude keywords.

Attributes:

section_inc_list (list[str]): Keywords for including sections.
section_exc_list (list[str]): Keywords for excluding sections.
fallback (bool): Enable fallback behavior if no sections match.

fallback: bool
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

section_exc_list: list[str]
section_inc_list: list[str]
class nlpmed_engine.api.models.SectionModel(*, text: str, start_index: int, end_index: int, sentences: list[SentenceModel] = <factory>, important_indices: list[int] = <factory>, duplicate_indices: list[int] = <factory>, expanded_indices: list[int] = <factory>, is_important: bool = False)

Bases: BaseModel

Model for representing a section with sentences.

Attributes:

text (str): Text of the section.
start_index (int): Start index of the section in the original text.
end_index (int): End index of the section in the original text.
sentences (list[SentenceModel]): List of sentences in the section.
important_indices (list[int]): Indices of important sentences in the section.
duplicate_indices (list[int]): Indices of duplicate sentences in the section.
expanded_indices (list[int]): Indices of expanded sentences in the section.
is_important (bool): Indicates if the section is marked as important.

duplicate_indices: list[int]
end_index: int
expanded_indices: list[int]
important_indices: list[int]
is_important: bool
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

sentences: list[SentenceModel]
start_index: int
text: str
class nlpmed_engine.api.models.SectionSplitterModel(*, status: str, delimiter: str = '\n\n')

Bases: ComponentStatusModel

Model for splitting sections using a delimiter.

Attributes:

delimiter (str): Delimiter used to split sections.

delimiter: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
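Splitting on the delimiter while recording offsets is what lets each section carry a start_index and end_index into the original text. A minimal sketch follows; the function split_sections is hypothetical, empty parts are skipped by choice, and end_index is treated as exclusive, none of which is confirmed by this reference.

```python
def split_sections(text: str, delimiter: str = "\n\n") -> list[tuple[str, int, int]]:
    """Return (section_text, start_index, end_index) tuples."""
    sections = []
    start = 0
    for part in text.split(delimiter):
        end = start + len(part)
        if part:
            # Offsets refer to positions in the original, unsplit text.
            sections.append((part, start, end))
        # Skip past the delimiter to reach the start of the next part.
        start = end + len(delimiter)
    return sections


note_text = "HPI: cough\n\nPlan: rest"
sections = split_sections(note_text)
```

Slicing the original text with a section's offsets returns the section text unchanged, which is a convenient invariant to check.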

class nlpmed_engine.api.models.SentenceExpanderModel(*, status: str, length_threshold: int = 50)

Bases: ComponentStatusModel

Model for expanding short sentences.

Attributes:

length_threshold (int): Threshold length for expanding short sentences.

length_threshold: int
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class nlpmed_engine.api.models.SentenceFilterModel(*, status: str, words_to_search: list[str] = <factory>)

Bases: ComponentStatusModel

Model for filtering sentences based on keywords.

Attributes:

words_to_search (list[str]): Keywords to filter sentences.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class nlpmed_engine.api.models.SentenceModel(*, text: str, start_index: int, end_index: int, is_duplicate: bool = False, is_important: bool = False, is_expanded: bool = False)

Bases: BaseModel

Model for representing a sentence with various attributes.

Attributes:

text (str): Text of the sentence.
start_index (int): Start index of the sentence in the original text.
end_index (int): End index of the sentence in the original text.
is_duplicate (bool): Indicates if the sentence is marked as duplicate.
is_important (bool): Indicates if the sentence is marked as important.
is_expanded (bool): Indicates if the sentence has been expanded.

end_index: int
is_duplicate: bool
is_expanded: bool
is_important: bool
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

start_index: int
text: str
class nlpmed_engine.api.models.SentenceSegmenterModel(*, status: str, nlp_model_name: str = 'en_core_sci_lg', batch_size: int = 10)

Bases: ComponentStatusModel

Model for sentence segmentation settings.

Attributes:

nlp_model_name (str): Name of the model used for sentence segmentation.
batch_size (int): Batch size for processing.

batch_size: int
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

nlp_model_name: str
class nlpmed_engine.api.models.StringInputModel(*, text: str)

Bases: BaseModel

Model for string inputs to be processed.

Attributes:

text (str): Text input to be processed.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

text: str
class nlpmed_engine.api.models.TextProcessingResponseModel(*, preprocessed_text: str | None = None, predicted_label: str | None = None, predicted_score: float | None = None, note: NoteModel | None = None)

Bases: BaseModel

Model for responses related to text processing output.

Attributes:

preprocessed_text (str | None): The preprocessed text output.
predicted_label (str | None): The predicted label from the model inference.
predicted_score (float | None): The prediction score associated with the predicted label.
note (NoteModel | None): The note object returned in debug mode.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

note: NoteModel | None
predicted_label: str | None
predicted_score: float | None
preprocessed_text: str | None
class nlpmed_engine.api.models.WordMaskerModel(*, status: str, words_to_mask: list[str] = <factory>, mask_char: str = '*')

Bases: ComponentStatusModel

Model for word masker component with mask settings.

Attributes:

words_to_mask (list[str]): List of words to mask in the text.
mask_char (str): Character used for masking.

mask_char: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

words_to_mask: list[str]
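One plausible reading of these two fields is that each listed word is overwritten character by character with mask_char. The sketch below implements that reading with the re module; the function mask_words, the whole-word and case-sensitive matching, and the length-preserving mask are all assumptions rather than documented behavior.

```python
import re


def mask_words(text: str, words_to_mask: list[str], mask_char: str = "*") -> str:
    if not words_to_mask:
        return text
    # Match any listed word as a whole word; escape regex metacharacters.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(word) for word in words_to_mask) + r")\b"
    )
    # Keep the text length stable by repeating mask_char per matched character.
    return pattern.sub(lambda match: mask_char * len(match.group()), text)


masked = mask_words("Seen by Dr. Smith today", ["Smith"])
```

Preserving length keeps any previously computed character offsets valid, which matters if masking runs after indices have been assigned.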

nlpmed_engine.api.routes module

Module contents