FHIR ValueSet(s) should enable wider support for CDC Pilot sites to use other codes for PCR tests, results, etc. (#12)

* COVID Diagnosis
formal ValueSet for ICD10 : U07.1

* COVID Diagnosis
optional support for SNOMED, in the rare case that ICD10 is not available.
SNOMED also reports for *secondary morbidities caused by COVID*, such as PNA or resp infections, etc.

* COVID PCR test LOINC test codes
qualitative and quantitative results.
Requires associated "interpretation" of PCR result

* ValueSets for lab results POSITIVE or NEGATIVE.
Includes synonyms like "identified", "presence", "weakly-reactive", etc.

* covid_define.sql Diagnosis (DX) defers to covid_symptom__define_dx using define_dx_icd10.sql. Some sites may require instead using SNOMED, which is define_dx_snomed.sql. Note that this SNOMED is not the default because our papers are written with the billing ICD10 diagnosis of U07.1.

* removed any mention of Influenza, potentially confusing/distracting to anyone reading this repository.

* BCH specific age is pediatric, other CDC pilot sites are general population age. Split age definitions into

define_age_pediatric.sql
define_age_general.sql

to use one or the other, toggle the correct target in
manifest.toml

* PCR test and result definitions are now externalized to
define_pcr_negative.sql
define_pcr_positive.sql

todo: need to workaround
"POSITIVE" vs "Positive"
"NEGATIVE" vs "Negative"
i2b2 specific observations that are not translated in transit by ETL
smart-on-fhir/cumulus-etl#231

* removed covid_symptom__define_pcr_result
see instead negative and positive SQL definitions (separate files now)

* simplified covid_symptom__define_symptom_cui to
select distinct cui,pref from covid_symptom__define_symptom

* renamed: removed superfluous study prefix from filenames, already in the correct folder.

* covid_define.sql was deleted, partitioned into individual SQL files

* covid_symptom__site_pcr is now found in
define_pcr.sql

* renamed covid_define.py to typesystem.py

This file is depended on by other repos for COVID analysis.

* renamed covid_define_symptom.*

* site_define.sql is now deprecated, see instead each "define_***.sql" file.

* count tables are now generated dynamically.

* covid counts now generated dynamically.
only the "table" of COVID data is prepared by hand to support study objectives.

* updated manifest.toml to include new "define" SQL.

updated pyproject.toml to include latest (core) "cumulus-library"

* changed default to GENERAL population, not BCH specific pediatric
#9

* comments only to reference ISSUES open as TODO

* fixed missing comma ","

* manifest.toml restored define_study_period.sql

* removed 2016 retroactive period.
use instead covid_symptom__define_period if modifications are needed to the covid study period.

#11

* renamed covid_symptom__symptom_nlp

removed "influenza" backdating in symptom comparison.

* towards better support of PCR tests and results in different coding systems and EHR support.

#8
#13

* renamed covid_symptom__define_dx_icd10

towards better support of PCR tests and results in different coding systems and EHR support.

#8
#13

* define_ed_note.sql removed "ED Social Work" note type, only ED Note types are supported for COVID symptoms analysis.

* "Observation Interpretation"
added standard POS/NEG FHIR ValueSet

* SNOMED or ICD10 can be used to build DX table_dx.sql

* COVID study period can optionally be extended to before COVID

('before-covid', date('2016-06-01'), date('2020-02-29')),

* MILESTONE: manifest.toml updated and "tested" by executing each SQL one at a time in Athena.

table_prevalence_ed.sql now references renamed
covid_symptom__symptom_nlp

* define_ed_note.sql
fixed column names
AS t (from_system, from_code, from_display, system, code, display);

* table_dx.sql

covid_symptom__dx now contains

    c.recorded_month as cond_month,
    c.recorded_year as cond_year,

* fixed minor issues in counts.py --> counts.sql

* removed start_year counts query
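
The todo above notes that i2b2 observations can arrive as "POSITIVE"/"NEGATIVE" while other sources report "Positive"/"Negative". One possible workaround is a case-insensitive comparison, sketched here in Python rather than in the study's SQL. The function name `normalize_pcr_display` and the synonym sets are illustrative assumptions and not part of this commit. The actual synonyms live in define_pcr_positive.sql and define_pcr_negative.sql.

```python
from typing import Optional

# Hypothetical synonym sets, for illustration only.
# The study's real lists are defined in define_pcr_positive.sql
# and define_pcr_negative.sql.
POSITIVE_SYNONYMS = {"positive", "identified"}
NEGATIVE_SYNONYMS = {"negative", "not detected"}


def normalize_pcr_display(display: Optional[str]) -> Optional[str]:
    """Map a raw PCR result display to 'POSITIVE', 'NEGATIVE', or None."""
    if display is None:
        return None
    # casefold() makes the match independent of the source system's casing.
    key = display.strip().casefold()
    if key in POSITIVE_SYNONYMS:
        return "POSITIVE"
    if key in NEGATIVE_SYNONYMS:
        return "NEGATIVE"
    # Unknown displays are left unmapped rather than guessed.
    return None
```

In SQL the same idea would amount to comparing on a lowercased display column, so that the ETL output does not need to be rewritten.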
comorbidity authored Jul 5, 2023
1 parent 31e0bde commit f39ebfa
Showing 35 changed files with 1,503 additions and 698 deletions.
112 changes: 112 additions & 0 deletions cumulus_library_covid/covid_symptom/count.py
@@ -0,0 +1,112 @@
from typing import List
from cumulus_library.schema import counts

def table(tablename: str, duration=None, study_prefix='covid_symptom') -> str:
    """
    Build a prefixed table or view name, e.g. covid_symptom__count_dx_week
    """
    if duration:
        return f'{study_prefix}__{tablename}_{duration}'
    else:
        return f'{study_prefix}__{tablename}'

def count_dx(duration='week'):
    """
    covid_symptom__count_dx_week
    covid_symptom__count_dx_month
    """
    view_name = table('count_dx', duration)
    from_table = table('dx')
    cols = [f'cond_{duration}',
            'enc_class_code',
            'age_at_visit',
            'ed_note',
            'variant_era']
    return counts.count_encounter(view_name, from_table, cols)

def count_pcr(duration='week'):
    """
    covid_symptom__count_pcr_week
    covid_symptom__count_pcr_month
    """
    view_name = table('count_pcr', duration)
    from_table = table('pcr')
    cols = [f'covid_pcr_{duration}',
            'covid_pcr_result_display',
            'variant_era',
            'ed_note',
            'age_at_visit',
            'gender',
            'race_display']
    return counts.count_encounter(view_name, from_table, cols)

def count_study_period(duration='month'):
    """
    covid_symptom__count_study_period_week
    covid_symptom__count_study_period_month
    covid_symptom__count_study_period_year
    """
    view_name = table('count_study_period', duration)
    from_table = table('study_period')
    cols = [f'start_{duration}',
            'variant_era', 'ed_note',
            'gender', 'age_group', 'race_display']
    return counts.count_encounter(view_name, from_table, cols)

def count_prevalence_ed(duration='month'):
    view_name = table('count_prevalence_ed', duration)
    from_table = table('prevalence_ed')
    cols = [
        f'author_{duration}',
        'covid_dx',
        'covid_icd10',
        'covid_pcr_result',
        'covid_symptom',
        'symptom_icd10_display',
        'variant_era',
        'age_group']
    return counts.count_encounter(view_name, from_table, cols)

def count_symptom(duration='week'):
    """
    covid_symptom__count_symptom_week
    covid_symptom__count_symptom_month
    """
    view_name = table('count_symptom', duration)
    from_table = table('symptom')
    cols = [f'author_{duration}',
            'symptom_display',
            'variant_era',
            'age_group',
            'gender',
            'race_display',
            'enc_class_code',
            'ed_note']
    return counts.count_encounter(view_name, from_table, cols)

def concat_view_sql(create_view_list: List[str]) -> str:
    """
    :param create_view_list: SQL prepared statements
    """
    separator = '-- ###########################################################'
    concat = list()

    for create_view in create_view_list:
        concat.append(separator + '\n' + create_view + '\n')

    return '\n'.join(concat)

def write_view_sql(view_list_sql: List[str], filename='count.sql') -> None:
    """
    :param view_list_sql: SQL prepared statements
    :param filename: path to output file, default 'count.sql' in PWD
    """
    with open(filename, 'w') as fout:
        fout.write(concat_view_sql(view_list_sql))


if __name__ == '__main__':

    write_view_sql([
        count_dx('week'),
        count_dx('month'),
        count_pcr('week'),
        count_pcr('month'),
        count_study_period('week'), count_study_period('month')])
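
count.py delegates SQL generation to `counts.count_encounter` from cumulus-library. Judging only from the generated count.sql in this commit, every view follows one template. The sketch below reproduces that shape to make the template explicit. `render_count_view` and its `min_subject` parameter are hypothetical stand-ins, not the library's implementation or signature.

```python
from typing import List


def render_count_view(view_name: str, from_table: str, cols: List[str],
                      min_subject: int = 10) -> str:
    """Render a CREATE VIEW statement shaped like those in count.sql.

    Hypothetical helper for illustration; the real SQL is produced by
    cumulus_library.schema.counts.count_encounter.
    """
    col_list = ', '.join(cols)
    return '\n'.join([
        f'CREATE or replace VIEW {view_name} AS',
        'with powerset as',
        '(',
        'select',
        'count(distinct subject_ref) as cnt_subject',
        ', count(distinct encounter_ref) as cnt_encounter',
        f', {col_list}',
        f'FROM {from_table}',
        # CUBE aggregates over every subset of the listed columns.
        'group by CUBE',
        f'( {col_list} )',
        ')',
        'select',
        'cnt_encounter as cnt',
        f', {col_list}',
        'from powerset',
        # Small cells are suppressed by distinct subject count.
        f'WHERE cnt_subject >= {min_subject}',
        'ORDER BY cnt desc;',
    ])


if __name__ == '__main__':
    print(render_count_view(
        'covid_symptom__count_dx_week',
        'covid_symptom__dx',
        ['cond_week', 'enc_class_code', 'age_at_visit', 'ed_note', 'variant_era']))
```

Note that the reported `cnt` is the encounter count, while the threshold is applied to the subject count.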
113 changes: 113 additions & 0 deletions cumulus_library_covid/covid_symptom/count.sql
@@ -0,0 +1,113 @@
-- ###########################################################
CREATE or replace VIEW covid_symptom__count_dx_week AS
with powerset as
(
select
count(distinct subject_ref) as cnt_subject
, count(distinct encounter_ref) as cnt_encounter
, cond_week, enc_class_code, age_at_visit, ed_note, variant_era
FROM covid_symptom__dx
group by CUBE
( cond_week, enc_class_code, age_at_visit, ed_note, variant_era )
)
select
cnt_encounter as cnt
, cond_week, enc_class_code, age_at_visit, ed_note, variant_era
from powerset
WHERE cnt_subject >= 10
ORDER BY cnt desc;

-- ###########################################################
CREATE or replace VIEW covid_symptom__count_dx_month AS
with powerset as
(
select
count(distinct subject_ref) as cnt_subject
, count(distinct encounter_ref) as cnt_encounter
, cond_month, enc_class_code, age_at_visit, ed_note, variant_era
FROM covid_symptom__dx
group by CUBE
( cond_month, enc_class_code, age_at_visit, ed_note, variant_era )
)
select
cnt_encounter as cnt
, cond_month, enc_class_code, age_at_visit, ed_note, variant_era
from powerset
WHERE cnt_subject >= 10
ORDER BY cnt desc;

-- ###########################################################
CREATE or replace VIEW covid_symptom__count_pcr_week AS
with powerset as
(
select
count(distinct subject_ref) as cnt_subject
, count(distinct encounter_ref) as cnt_encounter
, covid_pcr_week, covid_pcr_result_display, variant_era, ed_note, age_at_visit, gender, race_display
FROM covid_symptom__pcr
group by CUBE
( covid_pcr_week, covid_pcr_result_display, variant_era, ed_note, age_at_visit, gender, race_display )
)
select
cnt_encounter as cnt
, covid_pcr_week, covid_pcr_result_display, variant_era, ed_note, age_at_visit, gender, race_display
from powerset
WHERE cnt_subject >= 10
ORDER BY cnt desc;

-- ###########################################################
CREATE or replace VIEW covid_symptom__count_pcr_month AS
with powerset as
(
select
count(distinct subject_ref) as cnt_subject
, count(distinct encounter_ref) as cnt_encounter
, covid_pcr_month, covid_pcr_result_display, variant_era, ed_note, age_at_visit, gender, race_display
FROM covid_symptom__pcr
group by CUBE
( covid_pcr_month, covid_pcr_result_display, variant_era, ed_note, age_at_visit, gender, race_display )
)
select
cnt_encounter as cnt
, covid_pcr_month, covid_pcr_result_display, variant_era, ed_note, age_at_visit, gender, race_display
from powerset
WHERE cnt_subject >= 10
ORDER BY cnt desc;

-- ###########################################################
CREATE or replace VIEW covid_symptom__count_study_period_week AS
with powerset as
(
select
count(distinct subject_ref) as cnt_subject
, count(distinct encounter_ref) as cnt_encounter
, start_week, variant_era, ed_note, gender, age_group, race_display
FROM covid_symptom__study_period
group by CUBE
( start_week, variant_era, ed_note, gender, age_group, race_display )
)
select
cnt_encounter as cnt
, start_week, variant_era, ed_note, gender, age_group, race_display
from powerset
WHERE cnt_subject >= 10
ORDER BY cnt desc;

-- ###########################################################
CREATE or replace VIEW covid_symptom__count_study_period_month AS
with powerset as
(
select
count(distinct subject_ref) as cnt_subject
, count(distinct encounter_ref) as cnt_encounter
, start_month, variant_era, ed_note, gender, age_group, race_display
FROM covid_symptom__study_period
group by CUBE
( start_month, variant_era, ed_note, gender, age_group, race_display )
)
select
cnt_encounter as cnt
, start_month, variant_era, ed_note, gender, age_group, race_display
from powerset
WHERE cnt_subject >= 10
ORDER BY cnt desc;
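
Each view above groups by CUBE, which aggregates over every subset of the listed columns, and then drops any cell with fewer than 10 distinct subjects. The pure-Python sketch below mimics that logic on in-memory rows, as a way to reason about what the views return. `cube_counts` is a hypothetical name, and the row layout (dictionaries with `subject_ref` and `encounter_ref` keys) is an assumption made for the example.

```python
from itertools import combinations
from typing import Dict, List, Optional, Set, Tuple

Row = Dict[str, str]
Key = Tuple[Optional[str], ...]


def cube_counts(rows: List[Row], cols: List[str],
                min_subject: int = 10) -> Dict[Key, int]:
    """Count distinct encounters per CUBE cell, suppressing small cells.

    A None in a key position means that column was rolled up,
    as a NULL would indicate in the SQL result.
    """
    subjects: Dict[Key, Set[str]] = {}
    encounters: Dict[Key, Set[str]] = {}

    # Enumerate every subset of the grouping columns (the powerset).
    for size in range(len(cols) + 1):
        for kept in combinations(cols, size):
            for row in rows:
                key = tuple(row[c] if c in kept else None for c in cols)
                subjects.setdefault(key, set()).add(row['subject_ref'])
                encounters.setdefault(key, set()).add(row['encounter_ref'])

    # Report the encounter count, but filter on the subject count.
    return {key: len(encounters[key])
            for key, subj in subjects.items()
            if len(subj) >= min_subject}
```

With n grouping columns this enumerates 2^n groupings, which is why the wider views (for example the seven-column PCR views) produce many more cells than the five-column diagnosis views.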
19 changes: 0 additions & 19 deletions cumulus_library_covid/covid_symptom/count_covid.py

This file was deleted.

50 changes: 0 additions & 50 deletions cumulus_library_covid/covid_symptom/count_covid_dx.sql

This file was deleted.

84 changes: 0 additions & 84 deletions cumulus_library_covid/covid_symptom/count_covid_pcr.sql

This file was deleted.
