nlpmed_engine.api package¶
Submodules¶
nlpmed_engine.api.main module¶
Main entry point for the NLPMed API.
This module initializes the FastAPI application for the NLPMed API, sets up CORS middleware, and includes API routes for medical text processing. It also provides a health check endpoint to verify that the API is running.
- Modules:
FastAPI: The main framework used to create the API.
CORSMiddleware: Middleware for handling Cross-Origin Resource Sharing (CORS).
router: The router module that contains all API routes for NLPMed-Engine.
- Attributes:
app (FastAPI): The main FastAPI application instance.
- Middleware:
CORSMiddleware: Allows requests from any origin, limited to the GET and POST methods, with credentials disabled.
- Routes:
router: Includes the routes defined in nlpmed_engine.api.routes.
- Endpoints:
/ (GET): Health check endpoint that returns the status of the API.
- Usage:
Run this module to start the NLPMed API server.
- nlpmed_engine.api.main.health_check() → dict ¶
Health check endpoint for the NLPMed API.
This endpoint provides a simple health check to verify that the API is running and accessible. It returns a status message indicating the operational state of the API.
- Returns:
dict: A JSON response containing the status of the API, typically indicating that the API is running.
- Example Response:
{
"status": "API is running"
}
- Tags:
Health Check
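A client can use the health check to confirm the server is reachable before sending work. The sketch below uses only the standard library. The base URL (localhost, port 8000) and the helper names `parse_health` and `check_health` are assumptions for illustration; only the `/` route and the example response body come from the description above.

```python
import json
from urllib.request import urlopen

# Assumption: the server listens locally on port 8000; adjust for your deployment.
BASE_URL = "http://localhost:8000"


def parse_health(body: bytes) -> bool:
    """Return True if the body matches the documented example response."""
    payload = json.loads(body)
    return isinstance(payload, dict) and payload.get("status") == "API is running"


def check_health(base_url: str = BASE_URL, timeout: float = 5.0) -> bool:
    """Send GET / and report whether the API says it is running."""
    with urlopen(f"{base_url}/", timeout=timeout) as response:
        return response.status == 200 and parse_health(response.read())
```

Calling `check_health()` requires a running server; `parse_health` can be exercised offline against a captured response body.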
nlpmed_engine.api.mappers module¶
Mapping functions between Pydantic models and internal data structures.
This module provides a set of mapping functions to convert between Pydantic models used in the API layer and internal data structures used for processing within NLPMed-Engine. These mappers ensure data consistency and facilitate seamless transformation between API inputs/outputs and internal processing logic.
- Functions:
- map_pydantic_to_internal_sentence:
Maps a Pydantic SentenceModel to an internal Sentence object.
- map_internal_to_pydantic_sentence_model:
Maps an internal Sentence object to a Pydantic SentenceModel.
- map_pydantic_to_internal_section:
Maps a Pydantic SectionModel to an internal Section object.
- map_internal_to_pydantic_section_model:
Maps an internal Section object to a Pydantic SectionModel.
- map_pydantic_to_internal_note:
Maps a Pydantic NoteModel to an internal Note object.
- map_internal_to_pydantic_note_model:
Maps an internal Note object to a Pydantic NoteModel.
- map_pydantic_to_internal_patient:
Maps a Pydantic PatientModel to an internal Patient object.
- map_internal_to_pydantic_patient_model:
Maps an internal Patient object to a Pydantic PatientModel.
- nlpmed_engine.api.mappers.map_internal_to_pydantic_note_model(note: Note) → NoteModel ¶
Map an internal Note object to a Pydantic NoteModel.
- Args:
note (Note): The internal Note object to be converted.
- Returns:
NoteModel: The corresponding Pydantic model with mapped sections and attributes.
- nlpmed_engine.api.mappers.map_internal_to_pydantic_patient_model(patient: Patient) → PatientModel ¶
Map an internal Patient object to a Pydantic PatientModel.
- Args:
patient (Patient): The internal Patient object to be converted.
- Returns:
PatientModel: The corresponding Pydantic model with mapped notes.
- nlpmed_engine.api.mappers.map_internal_to_pydantic_section_model(section: Section) → SectionModel ¶
Map an internal Section object to a Pydantic SectionModel.
- Args:
section (Section): The internal Section object to be converted.
- Returns:
SectionModel: The corresponding Pydantic model with mapped sentences and attributes.
- nlpmed_engine.api.mappers.map_internal_to_pydantic_sentence_model(sentence: Sentence) → SentenceModel ¶
Map an internal Sentence object to a Pydantic SentenceModel.
- Args:
sentence (Sentence): The internal Sentence object to be converted.
- Returns:
SentenceModel: The corresponding Pydantic model with mapped attributes.
- nlpmed_engine.api.mappers.map_pydantic_to_internal_note(note_model: NoteModel) → Note ¶
Map a Pydantic NoteModel to an internal Note object.
- Args:
note_model (NoteModel): The Pydantic model representing a note.
- Returns:
Note: The internal Note object with mapped sections and attributes.
- nlpmed_engine.api.mappers.map_pydantic_to_internal_patient(patient_model: PatientModel) → Patient ¶
Map a Pydantic PatientModel to an internal Patient object.
- Args:
patient_model (PatientModel): The Pydantic model representing a patient.
- Returns:
Patient: The internal Patient object with mapped notes.
- nlpmed_engine.api.mappers.map_pydantic_to_internal_section(section_model: SectionModel) → Section ¶
Map a Pydantic SectionModel to an internal Section object.
- Args:
section_model (SectionModel): The Pydantic model representing a section.
- Returns:
Section: The internal Section object with mapped sentences and attributes.
- nlpmed_engine.api.mappers.map_pydantic_to_internal_sentence(sentence_model: SentenceModel) → Sentence ¶
Map a Pydantic SentenceModel to an internal Sentence object.
- Args:
sentence_model (SentenceModel): The Pydantic model representing a sentence.
- Returns:
Sentence: The internal Sentence object with the corresponding attributes.
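The mapper pairs above follow a simple round-trip pattern: copy each field from the API-layer model to the internal object and back. The sketch below illustrates that pattern for sentences only. It is not the package's implementation: both classes are plain dataclasses standing in for the real Pydantic `SentenceModel` and the internal `Sentence`, the internal class is assumed to carry the same six fields, and the function names `to_internal_sentence` and `to_sentence_model` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SentenceModel:
    """Stand-in for the Pydantic SentenceModel (same documented fields)."""
    text: str
    start_index: int
    end_index: int
    is_duplicate: bool = False
    is_important: bool = False
    is_expanded: bool = False


@dataclass
class Sentence:
    """Hypothetical internal class; assumed to mirror the API fields."""
    text: str
    start_index: int
    end_index: int
    is_duplicate: bool = False
    is_important: bool = False
    is_expanded: bool = False


def to_internal_sentence(model: SentenceModel) -> Sentence:
    """Copy every field from the API-layer model to the internal object."""
    return Sentence(
        text=model.text,
        start_index=model.start_index,
        end_index=model.end_index,
        is_duplicate=model.is_duplicate,
        is_important=model.is_important,
        is_expanded=model.is_expanded,
    )


def to_sentence_model(sentence: Sentence) -> SentenceModel:
    """Copy every field from the internal object back to the API-layer model."""
    return SentenceModel(
        text=sentence.text,
        start_index=sentence.start_index,
        end_index=sentence.end_index,
        is_duplicate=sentence.is_duplicate,
        is_important=sentence.is_important,
        is_expanded=sentence.is_expanded,
    )
```

Section, note, and patient mappers would presumably apply the same idea recursively, mapping each nested list element with the mapper one level down.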
nlpmed_engine.api.models module¶
Pydantic models for NLPMed-Engine API.
This module defines the Pydantic models used for validating and managing the input, output, and configuration data for the NLPMed-Engine. The models provide a structured representation of various components involved in text processing, including sentences, sections, notes, patients, and different text processing components.
- Classes:
- StringInputModel:
Model for string inputs to be processed.
- ComponentStatusModel:
Base model for component status.
- EncodingFixerModel:
Model for encoding fixer component status.
- PatternReplacerModel:
Model for pattern replacer component with pattern and target replacements.
- WordMaskerModel:
Model for word masker component with mask settings.
- NoteFilterModel:
Model for filtering notes based on keywords.
- SectionSplitterModel:
Model for splitting sections using a delimiter.
- SectionFilterModel:
Model for filtering sections based on include and exclude keywords.
- SentenceSegmenterModel:
Model for sentence segmentation settings.
- DuplicateCheckerModel:
Model for duplicate checking configuration.
- SentenceFilterModel:
Model for filtering sentences based on keywords.
- SentenceExpanderModel:
Model for expanding short sentences.
- JoinerModel:
Model for joining sentences and sections.
- MLInferenceModel:
Model for machine learning inference settings.
- ConfigModel:
Configuration model for all text processing components.
- SentenceModel:
Model for representing a sentence with various attributes.
- SectionModel:
Model for representing a section with sentences.
- NoteModel:
Model for representing a note with sections and preprocessed text.
- PatientModel:
Model for representing a patient with associated notes.
- TextProcessingResponseModel:
Model for responses related to text processing output.
- class nlpmed_engine.api.models.ComponentStatusModel(*, status: str)¶
Bases:
BaseModel
Base model for component status.
- Attributes:
status (str): Status of the component (‘enabled’, ‘disabled’, ‘excluded’).
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- status: str¶
- class nlpmed_engine.api.models.ConfigModel(*, encoding_fixer: dict | None = None, pattern_replacer: dict | None = None, word_masker: dict | None = None, note_filter: dict | None = None, section_splitter: dict | None = None, section_filter: dict | None = None, sentence_segmenter: dict | None = None, duplicate_checker: dict | None = None, sentence_filter: dict | None = None, sentence_expander: dict | None = None, joiner: dict | None = None, ml_inference: dict | None = None, debug: bool = False)¶
Bases:
BaseModel
Configuration model for all text processing components.
- Attributes:
encoding_fixer (dict | None): Configuration for the encoding fixer component.
pattern_replacer (dict | None): Configuration for the pattern replacer component.
word_masker (dict | None): Configuration for the word masker component.
note_filter (dict | None): Configuration for the note filter component.
section_splitter (dict | None): Configuration for the section splitter component.
section_filter (dict | None): Configuration for the section filter component.
sentence_segmenter (dict | None): Configuration for the sentence segmenter component.
duplicate_checker (dict | None): Configuration for the duplicate checker component.
sentence_filter (dict | None): Configuration for the sentence filter component.
sentence_expander (dict | None): Configuration for the sentence expander component.
joiner (dict | None): Configuration for the joiner component.
ml_inference (dict | None): Configuration for the machine learning inference component.
debug (bool): Enables debug mode, in which text processing responses also include the note object. Defaults to False.
- debug: bool¶
- duplicate_checker: dict | None¶
- encoding_fixer: dict | None¶
- joiner: dict | None¶
- ml_inference: dict | None¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- note_filter: dict | None¶
- pattern_replacer: dict | None¶
- section_filter: dict | None¶
- section_splitter: dict | None¶
- sentence_expander: dict | None¶
- sentence_filter: dict | None¶
- sentence_segmenter: dict | None¶
- word_masker: dict | None¶
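As an illustration, a configuration payload might look like the dictionary below. This is a sketch under one assumption: that each per-component dictionary uses the field names of the matching component model documented in this module (for example, `pattern_replacer` carrying the `PatternReplacerModel` fields). `ConfigModel` itself only types these entries as `dict | None`, so the exact keys the pipeline accepts are not confirmed here. Components left out stay at their default of `None`.

```python
import json

# Assumed shape: each entry mirrors the fields of the matching component model.
config = {
    "encoding_fixer": {"status": "enabled"},
    "pattern_replacer": {"status": "enabled", "pattern": r"\s{4,}", "target": "\n\n"},
    "section_splitter": {"status": "enabled", "delimiter": "\n\n"},
    "sentence_segmenter": {
        "status": "enabled",
        "nlp_model_name": "en_core_sci_lg",
        "batch_size": 10,
    },
    "duplicate_checker": {
        "status": "enabled",
        "num_perm": 256,
        "sim_threshold": 0.9,
        "length_threshold": 50,
    },
    "joiner": {
        "status": "enabled",
        "sentence_delimiter": "\n",
        "section_delimiter": "\n\n",
    },
    "debug": False,
}

# Serialize for use as the "config" part of a request body.
config_json = json.dumps(config, indent=2)
```

The values shown are the documented defaults of each component model, written out explicitly so the payload is easy to adapt.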
- class nlpmed_engine.api.models.DuplicateCheckerModel(*, status: str, num_perm: int = 256, sim_threshold: float = 0.9, length_threshold: int = 50)¶
Bases:
ComponentStatusModel
Model for duplicate checking configuration.
- Attributes:
num_perm (int): Number of permutations for MinHash.
sim_threshold (float): Similarity threshold for duplicates.
length_threshold (int): Length threshold for checking duplicates.
- length_threshold: int¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- num_perm: int¶
- sim_threshold: float¶
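To make `num_perm` and `sim_threshold` concrete, here is a small pure-Python MinHash illustration. It is not the duplicate checker's actual implementation, which this page does not describe: the word-trigram shingling, the salted BLAKE2b hashes, and all function names are assumptions chosen for the sketch. `length_threshold` is left out because its exact semantics are not documented here.

```python
import hashlib


def shingles(text: str, k: int = 3) -> set[str]:
    """Split text into overlapping k-word shingles (assumed tokenization)."""
    tokens = text.lower().split()
    if len(tokens) < k:
        return {" ".join(tokens)}
    return {" ".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def minhash_signature(items: set[str], num_perm: int = 256) -> list[int]:
    """Keep, for each of num_perm salted hash functions, the minimum hash value."""
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(4, "big")  # a distinct salt acts as a distinct hash function
        signature.append(
            min(
                int.from_bytes(
                    hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                    "big",
                )
                for item in items
            )
        )
    return signature


def estimate_similarity(sig_a: list[int], sig_b: list[int]) -> float:
    """Estimate Jaccard similarity as the fraction of matching signature slots."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must use the same num_perm")
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def is_near_duplicate(
    text_a: str, text_b: str, num_perm: int = 256, sim_threshold: float = 0.9
) -> bool:
    """Flag two texts as duplicates when the estimate reaches sim_threshold."""
    sig_a = minhash_signature(shingles(text_a), num_perm)
    sig_b = minhash_signature(shingles(text_b), num_perm)
    return estimate_similarity(sig_a, sig_b) >= sim_threshold
```

A larger `num_perm` gives a less noisy similarity estimate at the cost of more hashing; a higher `sim_threshold` makes the check stricter.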
- class nlpmed_engine.api.models.EncodingFixerModel(*, status: str)¶
Bases:
ComponentStatusModel
Model for encoding fixer component status.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class nlpmed_engine.api.models.JoinerModel(*, status: str, sentence_delimiter: str = '\n', section_delimiter: str = '\n\n')¶
Bases:
ComponentStatusModel
Model for joining sentences and sections.
- Attributes:
sentence_delimiter (str): Delimiter for joining sentences.
section_delimiter (str): Delimiter for joining sections.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- section_delimiter: str¶
- sentence_delimiter: str¶
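The two delimiters suggest a two-level join: sentences within a section first, then sections within a note. A minimal sketch of that idea, assuming sections are given as lists of sentence strings (the function name `join_note` is hypothetical, and the real joiner may also filter sentences before joining):

```python
def join_note(
    sections: list[list[str]],
    sentence_delimiter: str = "\n",
    section_delimiter: str = "\n\n",
) -> str:
    """Join sentences inside each section, then join the sections."""
    return section_delimiter.join(
        sentence_delimiter.join(sentences) for sentences in sections
    )
```

With the default delimiters, sentences end up on consecutive lines and sections are separated by a blank line.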
- class nlpmed_engine.api.models.MLInferenceModel(*, status: str, device: str = 'cpu', ml_model_path: str, ml_tokenizer_path: str)¶
Bases:
ComponentStatusModel
Model for machine learning inference settings.
- Attributes:
device (str): Device used for model inference (e.g., ‘cpu’, ‘cuda’).
ml_model_path (str): Path to the model.
ml_tokenizer_path (str): Path to the tokenizer.
- device: str¶
- ml_model_path: str¶
- ml_tokenizer_path: str¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class nlpmed_engine.api.models.NoteFilterModel(*, status: str, words_to_search: list[str] = <factory>)¶
Bases:
ComponentStatusModel
Model for filtering notes based on keywords.
- Attributes:
words_to_search (list[str]): Keywords to search in the notes.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- words_to_search: list[str]¶
- class nlpmed_engine.api.models.NoteModel(*, text: str, sections: list[SectionModel] = <factory>, preprocessed_text: str | None = None, predicted_label: str | None = None, predicted_score: float | None = None)¶
Bases:
BaseModel
Model for representing a note with sections and preprocessed text.
- Attributes:
text (str): Text of the note.
sections (list[SectionModel]): List of sections in the note.
preprocessed_text (str | None): Preprocessed text of the note.
predicted_label (str | None): Predicted label from the model inference.
predicted_score (float | None): Predicted score from the model inference.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- predicted_label: str | None¶
- predicted_score: float | None¶
- preprocessed_text: str | None¶
- sections: list[SectionModel]¶
- text: str¶
- class nlpmed_engine.api.models.PatientModel(*, patient_id: str, notes: list[NoteModel])¶
Bases:
BaseModel
Model for representing a patient with associated notes.
- Attributes:
patient_id (str): Unique identifier for the patient.
notes (list[NoteModel]): List of notes associated with the patient.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- notes: list[NoteModel]¶
- patient_id: str¶
- class nlpmed_engine.api.models.PatternReplacerModel(*, status: str, pattern: str = '\\s{4,}', target: str = '\n\n')¶
Bases:
ComponentStatusModel
Model for pattern replacer component with pattern and target replacements.
- Attributes:
pattern (str): Regex pattern to replace in the text.
target (str): Target string to replace matched pattern.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- pattern: str¶
- target: str¶
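With the default values, the pattern replacer turns any run of four or more whitespace characters into a blank line. The sketch below reproduces that effect with the standard `re` module; the function name `replace_pattern` is hypothetical, and treating `target` as a literal string (rather than a regex replacement template) is an assumption.

```python
import re


def replace_pattern(text: str, pattern: str = r"\s{4,}", target: str = "\n\n") -> str:
    """Replace every match of pattern with target, inserted literally."""
    # A callable replacement avoids backslash-escape processing of target.
    return re.sub(pattern, lambda _match: target, text)
```

Shorter whitespace runs are left untouched, so ordinary spacing between words survives.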
- class nlpmed_engine.api.models.SectionFilterModel(*, status: str, section_inc_list: list[str] = <factory>, section_exc_list: list[str] = <factory>, fallback: bool = False)¶
Bases:
ComponentStatusModel
Model for filtering sections based on include and exclude keywords.
- Attributes:
section_inc_list (list[str]): Keywords for including sections.
section_exc_list (list[str]): Keywords for excluding sections.
fallback (bool): Enable fallback behavior if no sections match.
- fallback: bool¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- section_exc_list: list[str]¶
- section_inc_list: list[str]¶
- class nlpmed_engine.api.models.SectionModel(*, text: str, start_index: int, end_index: int, sentences: list[SentenceModel] = <factory>, important_indices: list[int] = <factory>, duplicate_indices: list[int] = <factory>, expanded_indices: list[int] = <factory>, is_important: bool = False)¶
Bases:
BaseModel
Model for representing a section with sentences.
- Attributes:
text (str): Text of the section.
start_index (int): Start index of the section in the original text.
end_index (int): End index of the section in the original text.
sentences (list[SentenceModel]): List of sentences in the section.
important_indices (list[int]): Indices of important sentences in the section.
duplicate_indices (list[int]): Indices of duplicate sentences in the section.
expanded_indices (list[int]): Indices of expanded sentences in the section.
is_important (bool): Indicates if the section is marked as important.
- duplicate_indices: list[int]¶
- end_index: int¶
- expanded_indices: list[int]¶
- important_indices: list[int]¶
- is_important: bool¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- sentences: list[SentenceModel]¶
- start_index: int¶
- text: str¶
- class nlpmed_engine.api.models.SectionSplitterModel(*, status: str, delimiter: str = '\n\n')¶
Bases:
ComponentStatusModel
Model for splitting sections using a delimiter.
- Attributes:
delimiter (str): Delimiter used to split sections.
- delimiter: str¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
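Splitting on a delimiter while tracking offsets is what lets each section record a `start_index` and `end_index` into the original text, as `SectionModel` does. The sketch below shows one way to do this. It assumes `end_index` is exclusive (so `text[start_index:end_index]` recovers the section) and returns plain dictionaries; the function name `split_sections` is hypothetical and the real splitter may differ.

```python
def split_sections(text: str, delimiter: str = "\n\n") -> list[dict]:
    """Split text on delimiter and record each section's character offsets."""
    sections = []
    position = 0
    for chunk in text.split(delimiter):
        sections.append(
            {
                "text": chunk,
                "start_index": position,
                "end_index": position + len(chunk),  # assumed exclusive
            }
        )
        # Skip past this chunk and the delimiter that followed it.
        position += len(chunk) + len(delimiter)
    return sections
```

Because offsets refer to the original text, later components can map a section back to its source span even after other sections have been filtered out.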
- class nlpmed_engine.api.models.SentenceExpanderModel(*, status: str, length_threshold: int = 50)¶
Bases:
ComponentStatusModel
Model for expanding short sentences.
- Attributes:
length_threshold (int): Threshold length for expanding short sentences.
- length_threshold: int¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class nlpmed_engine.api.models.SentenceFilterModel(*, status: str, words_to_search: list[str] = <factory>)¶
Bases:
ComponentStatusModel
Model for filtering sentences based on keywords.
- Attributes:
words_to_search (list[str]): Keywords to filter sentences.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- words_to_search: list[str]¶
- class nlpmed_engine.api.models.SentenceModel(*, text: str, start_index: int, end_index: int, is_duplicate: bool = False, is_important: bool = False, is_expanded: bool = False)¶
Bases:
BaseModel
Model for representing a sentence with various attributes.
- Attributes:
text (str): Text of the sentence.
start_index (int): Start index of the sentence in the original text.
end_index (int): End index of the sentence in the original text.
is_duplicate (bool): Indicates if the sentence is marked as duplicate.
is_important (bool): Indicates if the sentence is marked as important.
is_expanded (bool): Indicates if the sentence has been expanded.
- end_index: int¶
- is_duplicate: bool¶
- is_expanded: bool¶
- is_important: bool¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- start_index: int¶
- text: str¶
- class nlpmed_engine.api.models.SentenceSegmenterModel(*, status: str, nlp_model_name: str = 'en_core_sci_lg', batch_size: int = 10)¶
Bases:
ComponentStatusModel
Model for sentence segmentation settings.
- Attributes:
nlp_model_name (str): Name of the model used for sentence segmentation.
batch_size (int): Batch size for processing.
- batch_size: int¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- nlp_model_name: str¶
- class nlpmed_engine.api.models.StringInputModel(*, text: str)¶
Bases:
BaseModel
Model for string inputs to be processed.
- Attributes:
text (str): Text input to be processed.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- text: str¶
- class nlpmed_engine.api.models.TextProcessingResponseModel(*, preprocessed_text: str | None = None, predicted_label: str | None = None, predicted_score: float | None = None, note: NoteModel | None = None)¶
Bases:
BaseModel
Model for responses related to text processing output.
- Attributes:
preprocessed_text (str | None): The preprocessed text output.
predicted_label (str | None): The predicted label from the model inference.
predicted_score (float | None): The prediction score associated with the predicted label.
note (NoteModel | None): The note object returned in debug mode.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- note: NoteModel | None¶
- predicted_label: str | None¶
- predicted_score: float | None¶
- preprocessed_text: str | None¶
- class nlpmed_engine.api.models.WordMaskerModel(*, status: str, words_to_mask: list[str] = <factory>, mask_char: str = '*')¶
Bases:
ComponentStatusModel
Model for word masker component with mask settings.
- Attributes:
words_to_mask (list[str]): List of words to mask in the text.
mask_char (str): Character used for masking.
- mask_char: str¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- words_to_mask: list[str]¶
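A word masker of this shape could be sketched as follows. The behavior shown is an assumption, since this page does not specify it: whole-word, case-insensitive matching, with every character of a matched word replaced by `mask_char`. The function name `mask_words` is hypothetical.

```python
import re


def mask_words(text: str, words_to_mask: list[str], mask_char: str = "*") -> str:
    """Replace each listed word with mask_char repeated to the word's length."""
    words = [word for word in words_to_mask if word]
    if not words:
        return text
    # Assumed matching rule: whole words only, ignoring case.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(word) for word in words) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda match: mask_char * len(match.group(0)), text)
```

Keeping the masked span the same length as the original word preserves character offsets, which matters if `start_index` and `end_index` values are computed afterwards.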
nlpmed_engine.api.routes module¶
API routes for NLPMed-Engine.
This module defines the API endpoints for processing medical text data using NLPMed-Engine’s pipelines. The endpoints handle various tasks, including processing individual patients, batch processing of patients, and processing standalone text inputs. Each route leverages Pydantic models to validate inputs and outputs, ensuring data integrity and consistency.
- Modules:
os: Standard library module for accessing environment variables.
typing: Provides type annotations for enhanced code readability.
fastapi: The main framework for creating API routes and managing dependencies.
- Dependencies:
SinglePipeline: Configured instance of the single processing pipeline.
BatchPipeline: Configured instance of the batch processing pipeline.
- Functions:
get_single_pipeline: Dependency that returns the single pipeline instance.
get_batch_pipeline: Dependency that returns the batch pipeline instance.
- Routes:
- /process_patient (POST):
Processes a single patient with the specified configuration.
- /process_batch_patients (POST):
Processes a batch of patients with the specified configuration.
- /process_text (POST):
Processes a standalone text input with the specified configuration.
- nlpmed_engine.api.routes.get_batch_pipeline() → BatchPipeline ¶
Provides the singleton instance of the BatchPipeline.
- Returns:
BatchPipeline: The configured instance of the BatchPipeline for batch processing.
- nlpmed_engine.api.routes.get_single_pipeline() → SinglePipeline ¶
Provides the singleton instance of the SinglePipeline.
- Returns:
SinglePipeline: The configured instance of the SinglePipeline for processing.
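One common way to provide a singleton through a dependency function is to cache the constructor call, so every request receives the same object. The sketch below shows that pattern with `functools.lru_cache`. How the real module builds and stores its pipelines is not documented here; the `SinglePipeline` class below is a hypothetical stand-in with no real behavior.

```python
from functools import lru_cache


class SinglePipeline:
    """Hypothetical stand-in for the real pipeline class."""

    def __init__(self) -> None:
        # The real pipeline would load models and components here.
        self.ready = True


@lru_cache(maxsize=1)
def get_single_pipeline() -> SinglePipeline:
    """Create the pipeline on first call and reuse it afterwards."""
    return SinglePipeline()
```

Reusing one instance avoids reloading heavy resources, such as NLP models, on every request.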
- nlpmed_engine.api.routes.process_batch_patients(patients: list[PatientModel], config: ConfigModel, pipeline: Annotated[BatchPipeline, Depends(get_batch_pipeline)]) → list[PatientModel] ¶
Processes a batch of patients using the specified configuration and batch pipeline.
- Args:
patients (list[PatientModel]): A list of patient data to be processed.
config (ConfigModel): The configuration settings for the batch processing pipeline.
pipeline (BatchPipeline): The pipeline instance for batch processing.
- Returns:
list[PatientModel]: A list of processed patient data.
- Raises:
HTTPException: If an error occurs during the batch processing of patients.
- nlpmed_engine.api.routes.process_patient(patient: PatientModel, config: ConfigModel, pipeline: Annotated[SinglePipeline, Depends(get_single_pipeline)]) → PatientModel ¶
Processes a single patient using the specified configuration and pipeline.
- Args:
patient (PatientModel): The patient data to be processed.
config (ConfigModel): The configuration settings for the processing pipeline.
pipeline (SinglePipeline): The pipeline instance for processing the patient.
- Returns:
PatientModel: The processed patient data.
- Raises:
HTTPException: If an error occurs during the processing of the patient.
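Both patient routes take two body parameters, and FastAPI expects multiple body parameters to be nested under their parameter names in the JSON body. The sketch below builds request bodies on that basis. The helper names are hypothetical, and the minimal note shape (only the required `text` field) is taken from `NoteModel`; the `pipeline` argument is injected by the server and is not part of the body.

```python
import json


def build_patient(patient_id: str, note_texts: list[str]) -> dict:
    """Build a PatientModel-shaped dict with one minimal note per text."""
    return {"patient_id": patient_id, "notes": [{"text": text} for text in note_texts]}


def single_patient_body(patient: dict, config: dict) -> str:
    """JSON body for POST /process_patient, keyed by parameter name."""
    return json.dumps({"patient": patient, "config": config})


def batch_patients_body(patients: list[dict], config: dict) -> str:
    """JSON body for POST /process_batch_patients, keyed by parameter name."""
    return json.dumps({"patients": patients, "config": config})
```

Optional fields such as `sections` or `predicted_label` can be omitted from the request, since the models give them defaults.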
- nlpmed_engine.api.routes.process_text(input_data: StringInputModel, config: ConfigModel, pipeline: Annotated[SinglePipeline, Depends(get_single_pipeline)]) → TextProcessingResponseModel | NoteModel ¶
Processes a standalone text input using the specified configuration and pipeline.
- Args:
input_data (StringInputModel): The text input data to be processed.
config (ConfigModel): The configuration settings for the processing pipeline.
pipeline (SinglePipeline): The pipeline instance for processing the text input.
- Returns:
TextProcessingResponseModel: The response containing the preprocessed text, predicted label, and predicted score. When debug is True, its note field additionally carries the full NoteModel.
- Raises:
HTTPException: If an error occurs during the processing of the text input.
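A request to `/process_text` can be prepared with the standard library as sketched below. The base URL and the function name `build_process_text_request` are assumptions, as is the absence of a router prefix in front of the path. The body nests `input_data` and `config` under their parameter names, which is how FastAPI reads multiple body parameters.

```python
import json
from urllib.request import Request


def build_process_text_request(
    text: str,
    config: dict,
    base_url: str = "http://localhost:8000",  # assumption: local server on port 8000
) -> Request:
    """Prepare (but do not send) a POST request for the /process_text route."""
    body = {"input_data": {"text": text}, "config": config}
    return Request(
        f"{base_url}/process_text",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends it; setting `"debug": True` in `config` should make the response include the `note` field described above.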