The Children's Learning Institute (CLI) at UTHealth Houston is seeking a Data Scientist to support our Texas State Initiatives (TSI), a portfolio of large-scale, statewide programs at the intersection of advanced analytics, applied education research, and emerging AI/ML technologies. Our work centers on state-level educational data spanning a range of statewide systems, including CLI Engage, our instructional delivery and assessment platform, and TECPDS, the Texas Early Childhood Professional Development System, which serves as the statewide workforce registry tracking educator credentials, training, and professional development. Together, these systems generate large, heterogeneous datasets linking educator characteristics, instructional practice, and child outcomes across the state of Texas. This work involves close collaboration with State Education Agencies (SEAs), Local Education Agencies (LEAs), and other stakeholders to generate rigorous, policy-relevant insights that directly shape how Texas supports its early childhood workforce and the children they serve. Supported by long-standing state contracts, this position offers the stability of sustained, mission-driven work at the intersection of research and public policy.
The successful candidate will engage in analytically rich and methodologically varied work across a range of research and evaluation questions, including randomized controlled trials and quasi-experimental studies. This includes multilevel and growth modeling, psychometric and IRT analyses, structural equation modeling, latent variable and mixture approaches, and classification methods, among others. The candidate should be prepared to work fluidly across these methods as research and evaluation questions evolve. The datasets involved are large, multi-source, and frequently updated, with nested and longitudinal structures, requiring both statistical sophistication and strong data engineering instincts. Much of this work has direct policy and scientific implications, and findings must be translated into compelling visual products including dashboards, interactive reports, data visualizations, and graphical summaries for state agency partners and program staff, as well as scholarly and technical products.
TSI is an active and growing area of applied AI and machine learning work hosted on AWS. The Data Scientist is expected to meaningfully engage with AI/ML workflows as they relate to analytic and research goals. This includes contributing to predictive modeling and classification for large-scale assessment and workforce data, NLP-based approaches to instructional quality monitoring and document processing, AI-assisted automation of reporting and survey analysis, and generative AI applications supporting educators and program staff. Intellectual curiosity about these methods and a willingness to learn and apply them in service of applied research and operational analytics are essential.
In addition to the core analytic work, the Data Scientist will support both recurring, large-scale reporting (annual, quarterly) and higher-frequency deliverables (e.g., weekly or near-real-time monitoring reports), with the expectation that analytic workflows will be built and maintained to be efficient, reproducible, and responsive to programmatic change. This position will work closely with a faculty manager, who will provide substantive and methodological guidance as needed. Proficiency in SAS is crucial, as many existing data pipelines and analyses are implemented and documented in SAS, and the candidate should be able to understand, maintain, and adapt legacy code including SAS macros. Proficiency in R or Python is also required. Opportunities exist to contribute to a range of scholarly and technical products commensurate with interest and contribution, including peer-reviewed publications, grant applications, technical reports and white papers, methods and software documentation, Shiny applications, and open-source analytic tools hosted on GitHub.
What we do here changes the world. UTHealth Houston is Texas’ resource for healthcare education, innovation, scientific discovery, and excellence in patient care. That’s where you come in.
Once you join us you won't want to leave. It’s because we reward our team for the excellent service they provide. Our total rewards package includes the benefits you’d expect from a top healthcare organization (benefits, insurance, etc.), plus:
- 100% paid medical premiums for our full-time employees
- Generous time off (holidays, preventative leave day, both vacation and sick time – all of which equates to around 37-38 days per year)
- The longer you stay, the more vacation you’ll accrue!
- Longevity Pay (Monthly payments after two years of service)
- Build your future with our awesome retirement/pension plan!
We take care of our employees! As a world-renowned institution, our employees’ wellbeing is important to us. We offer work/life services such as...
- Free financial and legal counseling
- Free mental health counseling services
- Gym membership discounts and access to wellness programs
- Other employee discounts including entertainment, car rentals, cell phones, etc.
- Resources for child and elder care
- Plus many more!
Position Summary:
The Data Scientist – Level II is a skillful team player with a clear track record of transforming data into actionable insights and knowledge. This individual is a deliberate and systematic scientist who can clearly articulate the hypotheses to be tested, practice with rigor and precision, and maintain a balanced/independent point of view to achieve valid and accepted findings. Uses social determinants of health data sets in conjunction with administrative claims and electronic health records to measure the relational impact of these variables when applicable.
It is essential for this person to be a very effective communicator who can guide business and clinical customers to an understanding of applied methods, rationale, statistical validity, and the appropriate application of insights.
Position Key Accountabilities:
1. Supports faculty, business and research customers with data science services on a variety of projects which may include population health, education research, workforce analytics, program evaluation, applied statistics, personalized medicine and translational research.
2. Builds models that incorporate diverse data sets representing administrative, clinical and social health indicators when applicable.
3. Applies a variety of methods such as statistical analysis, data mining, predictive analytics, time series analysis, multivariate regression analysis, statistical process control and optimization solutions.
4. Performs as a results-oriented problem solver to quickly synthesize complex scenarios, connect abstract patterns/information, apply appropriate methods, and implement practical and timely solutions.
5. Demonstrates and embraces a collaborative approach to fully realize the talents of colleagues such as data stewards, IT staff, business analysts, business stakeholders and industry experts.
6. Assesses the impact of computer networking and data warehousing on data use and analysis when applicable.
7. Designs and produces high-quality data visualizations, dashboards, graphical reports, and other visual products that communicate complex findings clearly and effectively to both technical and non-technical audiences including state agency partners and policymakers.
8. Performs other duties as assigned.
Certification/Skills:
Proficiency with statistical packages, databases and programming languages for data preparation, storage, transformation, analysis or visualization. Examples include R, SAS, Stata, SPSS, MATLAB, SQL, Python, Hadoop, Pig, Hive, MapReduce, Java, Tableau, Excel or other big data frameworks.
Ability to work with a wide variety of data formats/sources such as relational databases, XML, JSON, CSV, text, sensors/activity trackers, clickstreams, social media, and geolocation.
Effective communication skills including written, oral, listening and interpersonal.
Minimum Education:
Master’s degree in health care, management, mathematics, engineering, informatics, or a related field.
Minimum Experience:
3 years of enterprise experience as a data scientist, informaticist, statistician, senior analyst or related role, with demonstrated experience across multiple concurrent projects.
Experience with or strong interest in AI/ML methods and their application to research and analytic problems preferred.
Familiarity with cloud-based platforms, particularly AWS, or demonstrated aptitude and willingness to develop that familiarity preferred.
Physical Requirements:
Exerts up to 50 pounds of force occasionally and/or up to 20 pounds frequently and/or a negligible amount constantly to move objects.
Security Sensitive:
This position is a security-sensitive position pursuant to Texas Education Code §51.215 and Texas Government Code §411.094. To the extent that a position requires the holder to research, work on, or have access to critical infrastructure as defined in Texas Business and Commerce Code §117.001(2), the ability to maintain the security or integrity of the infrastructure is a minimum qualification to be hired for and to continue to be employed in that position. Personnel in such positions, and similarly situated state contractors, will be routinely reviewed to determine whether things such as criminal history or continuous connections to the government or political apparatus of a foreign adversary might prevent the applicant, employee, or contractor from being able to maintain the security or integrity of the infrastructure. A foreign adversary is a nation listed in 15 C.F.R. §791.4.
Residency Requirement:
Employees must permanently reside and work in the State of Texas.