Merative Annotator for Clinical Data Container Edition

Cancer (Deprecated)

Detects potential cancer disease terms such as adenocarcinoma and carcinomatosis. Additional features that the annotator can return include the name of the cancer, measurement, cancer grade, site, date, and modality.


  • umls.latest
  • umls.2022AA
  • umls.2021AA
  • umls.2020AA (deprecated - will be removed in 2023)
Defines the version of the UMLS library that is used when analyzing unstructured data.

The value umls.latest references the latest version of UMLS available within the service. As newer versions of UMLS become available, library configurations that use umls.latest automatically switch to the newest one. Declaring a specific version of UMLS is recommended to avoid unexpected changes in output when newer versions are released. It also allows newer versions of UMLS to be evaluated before they are used in production.
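As a rough illustration of the recommendation above, the sketch below labels a library value as floating, deprecated, or pinned. The function name and the returned labels are hypothetical and not part of the service; only the version strings come from the list above.

```python
# Hypothetical helper; only the version strings are taken from this page.
KNOWN_LIBRARIES = {"umls.latest", "umls.2022AA", "umls.2021AA", "umls.2020AA"}
DEPRECATED_LIBRARIES = {"umls.2020AA"}


def classify_umls_library(value: str) -> str:
    """Return 'floating', 'deprecated', or 'pinned' for a library value."""
    if value not in KNOWN_LIBRARIES:
        raise ValueError(f"unknown UMLS library value: {value!r}")
    if value == "umls.latest":
        # Output may change whenever the service adds a newer UMLS version.
        return "floating"
    if value in DEPRECATED_LIBRARIES:
        return "deprecated"
    return "pinned"
```

A check like this could run before a configuration is promoted to production, so that only pinned values pass.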

Annotation Types

  • aci.IcaCancerDiagnosisInd


begin: The start position of the annotation as a character offset into the text. The smallest possible start position is 0.
end: The end position of the annotation as a character offset into the text. The end position points at the first character after the annotation, such that end - begin equals the length of the coveredText.
coveredText: The text covered by the annotation, as a string.
modality: Potential values are positive and negative, based on whether the patient has or does not have the identified cancer.
sectionSurfaceForm: Medical documents have many sections, such as patient information, previous medical history, and family history. This feature holds the covered text that identifies the section of the document in which the annotation appears. The default value of this feature is document.
cancer: See aci.Cancer table below.
date: See aci.Date table below.
measurement: See aci.Measurement table below.
CancerGrade: See aci.CancerGrade table below.
site: See aci.SiteInd table below.
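The relationship between begin, end, and coveredText can be checked with a short sketch. The annotation dictionary below is a hypothetical example built from the sentence "He has lung cancer", not actual service output. It assumes offsets count characters the same way Python string indexing does, which holds for ASCII text.

```python
text = "He has lung cancer"

# Hypothetical annotation; the field names follow the table above.
annotation = {"begin": 7, "end": 18, "coveredText": "lung cancer"}


def covered_text(document: str, ann: dict) -> str:
    """Slice the document with the half-open range [begin, end)."""
    return document[ann["begin"]:ann["end"]]


span = covered_text(text, annotation)
# end points at the first character after the annotation,
# so end - begin equals the length of the covered text.
assert span == annotation["coveredText"]
assert annotation["end"] - annotation["begin"] == len(span)
```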

Subtypes for aci.IcaCancerDiagnosisInd


aci.Cancer

begin: The start position of the annotation as a character offset into the text. The smallest possible start position is 0.
end: The end position of the annotation as a character offset into the text. The end position points at the first character after the annotation, such that end - begin equals the length of the coveredText.
coveredText: The text covered by the annotation, as a string.
cancerSurfaceForm: Covered text that represents the cancer. For example, in the text He has lung cancer, the cancerSurfaceForm is lung cancer.
cancerNormalizedName: Normalized name for the cancer from the UMLS dictionary. For example, in the text He has lung cancer, the cancerNormalizedName is primary malignant neoplasm of lung.
ccsCode: CCS stands for Clinical Classifications Software, which is used to categorize diagnoses and procedures so that they can be used for further analysis.
hccCode: HCC stands for Hierarchical Condition Categories, which are primarily used by Medicare and Medicaid.
loincId: LOINC stands for Logical Observation Identifiers Names and Codes. The value for this feature comes from UMLS.
nciCode: The NCI Thesaurus covers vocabulary for cancer-related clinical care, translational and basic research, and public information and administrative activities. The value for this feature comes from UMLS.
meshId: The MeSH thesaurus is a controlled vocabulary used for indexing, cataloging, and searching for biomedical and health-related information and documents. The value for this feature comes from UMLS.
icd9Code: ICD stands for International Classification of Diseases. The number 9 is the revision number of this code set.
icd10Code: ICD stands for International Classification of Diseases. The number 10 is the revision number of this code set.
snomedConceptId: Numerical code provided by the SNOMED dictionaries that represents the cancer.
cui: UMLS Concept Unique ID (CUI). CUIs are used to uniquely identify concepts across different UMLS sources. Depending on the source of the cancer information, this value may not be available.
morphologyCode: A value that describes the behavior of the cancer, from malignant to benign.
behavior: A code that represents the type of growth, such as benign, malignant, in situ, or uncertain. This code applies only to cancer-related diseases. See Behavior Codes below.
behaviorSource: The source of the behavior code: the morphology code, the ICD-9 code, or the ICD-10 code.


aci.Date

begin: The start position of the annotation as a character offset into the text. The smallest possible start position is 0.
end: The end position of the annotation as a character offset into the text. The end position points at the first character after the annotation, such that end - begin equals the length of the coveredText.
coveredText: The text covered by the annotation, as a string.
dateInMilliseconds: The date of the event as a java.util.Calendar value, expressed as the difference, measured in milliseconds, between the date of the event and midnight, January 1, 1970 UTC.
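Because dateInMilliseconds counts milliseconds from midnight, January 1, 1970 UTC, it can be turned back into a calendar date with simple arithmetic. The helper below is a minimal sketch; its name is hypothetical, and it assumes the value arrives as an integer.

```python
from datetime import datetime, timedelta, timezone

# Midnight, January 1, 1970 UTC, the reference point named in the table above.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def date_from_millis(millis: int) -> datetime:
    """Convert milliseconds since the epoch into a timezone-aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=millis)


# One day is 86,400,000 milliseconds, so this is January 2, 1970 UTC.
one_day_later = date_from_millis(86_400_000)
```

Adding a timedelta to the epoch, rather than calling datetime.fromtimestamp, keeps the arithmetic exact and also works for dates before 1970.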


aci.Measurement

begin: The start position of the annotation as a character offset into the text. The smallest possible start position is 0.
end: The end position of the annotation as a character offset into the text. The end position points at the first character after the annotation, such that end - begin equals the length of the coveredText.
coveredText: The text covered by the annotation, as a string.
dimension: Type of measurement. For example, in the text 4.3mm tumor, the dimension of the measurement is length.


aci.CancerGrade

begin: The start position of the annotation as a character offset into the text. The smallest possible start position is 0.
end: The end position of the annotation as a character offset into the text. The end position points at the first character after the annotation, such that end - begin equals the length of the coveredText.
coveredText: The text covered by the annotation, as a string.
gradeValue: The value of the grade.


aci.SiteInd

begin: The start position of the annotation as a character offset into the text. The smallest possible start position is 0.
end: The end position of the annotation as a character offset into the text. The end position points at the first character after the annotation, such that end - begin equals the length of the coveredText.
coveredText: The text covered by the annotation, as a string.
gradeValue: The value of the grade.
siteNormalizedName: The normalized name for the site from UMLS.
compound: Whether this is a multi-site term.
snomedConceptId: Numerical code provided by the SNOMED dictionaries that represents the site.

Behavior Codes

1: Unknown (uncertain if benign or malignant)
3: Malignant (primary)
6: Malignant (metastatic or secondary site)
9: Malignant (uncertain if primary or metastatic)
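A client that displays the behavior feature may want to translate the code into its description. The lookup below is a minimal sketch using only the codes listed above. The function name and the fallback string are hypothetical, and the sketch assumes the code may arrive either as an integer or as a numeric string.

```python
# Descriptions copied from the Behavior Codes table above.
BEHAVIOR_CODES = {
    1: "Unknown (uncertain if benign or malignant)",
    3: "Malignant (primary)",
    6: "Malignant (metastatic or secondary site)",
    9: "Malignant (uncertain if primary or metastatic)",
}


def describe_behavior(code) -> str:
    """Return the description for a behavior code, or a fallback for unlisted codes."""
    # int() accepts both 3 and "3"; this input handling is an assumption.
    return BEHAVIOR_CODES.get(int(code), "Not listed on this page")
```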

Sample Response

Sample response from the cancer annotator for the text: She was previously treated for adenocarcinoma of the colon.

"unstructured": [
"text": "She was previously treated for adenocarcinoma of the colon.",
"data": {
"IcaCancerDiagnosisInd": [
"type": "aci.IcaCancerDiagnosisInd",
"begin": 31,