Automated Analytics Platform for a Company Serving Multiple Healthcare Providers
About Our Client
The Client is a North American analytics company.
Need for an Automated Data Analytics Platform
The Client provides analytics services to multiple healthcare providers, including preparation of providers’ diverse data for BI querying and reporting. In the past, the Client had to develop dedicated data processing pipelines for each new provider. The company decided to build an analytics platform that would enable automated data preparation and delivery according to each provider’s unique rules.
Building a Common Model for Disparate Data
ScienceSoft appointed a data analytics team consisting of a project manager, a business analyst, a solution architect, data engineers, and a DevOps engineer. To ensure HIPAA compliance, ScienceSoft signed a Business Associate Agreement (BAA) with the Client before our team gained access to any protected information.
The Client’s IT team provided us with a high-level idea of the common data model that would serve as the basis for a centralized data management framework. The model had to accommodate various types of data, such as patient care and treatment history and insurance-related data, arriving in different file formats (e.g., CSV, JSON, XML, ZIP).
ScienceSoft analyzed diverse data samples to identify the standard data fields and design universal data transformation rules for all healthcare providers. To optimize the process, our team worked with a limited number of files that reflected the diversity of the available formats and data structures. During data model creation, ScienceSoft had regular meetings with the Client to report on the progress, get feedback from the stakeholders, and adjust the data model accordingly.
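To illustrate the idea of universal transformation rules, here is a minimal Python sketch of mapping provider-specific fields onto a common model. It is not the Client's actual model: the field names, the provider identifiers, and the PROVIDER_RULES table are hypothetical, and only CSV and JSON inputs are handled.

```python
import csv
import io
import json

# Hypothetical common-model fields; the real model is not described in detail.
COMMON_FIELDS = ("patient_id", "visit_date", "insurer")

# Hypothetical per-provider rules: common field -> the provider's own field name.
PROVIDER_RULES = {
    "provider_a": {"patient_id": "PatientID", "visit_date": "Date", "insurer": "Payer"},
    "provider_b": {"patient_id": "pid", "visit_date": "visit", "insurer": "insurance"},
}


def parse_records(payload: str, fmt: str) -> list:
    """Parse a text payload into a list of dicts, one per record."""
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(payload)))
    if fmt == "json":
        return json.loads(payload)
    raise ValueError(f"unsupported format: {fmt}")


def to_common_model(record: dict, provider: str) -> dict:
    """Rename a provider-specific record's fields to the common model."""
    rules = PROVIDER_RULES[provider]
    # Fields missing from the source record become None.
    return {name: record.get(rules[name]) for name in COMMON_FIELDS}
```

With rules kept in a lookup table like this, onboarding a new provider would mean adding one more entry rather than writing a dedicated pipeline.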
Healthcare Analytics Platform Development
The analytics platform comprises the following components:
Data ingestion from the Client’s databases and data lakes.
Raw data storage. All files land in the raw data storage in their initial format and are enriched with metadata (e.g., file owner, upload date).
Data transformation. The data passes through ETL and ELT pipelines and lands in the staging area. During transformation, the data is validated, deduplicated, cross-checked, and otherwise adjusted to fit the common data model.
Data aggregation in a data warehouse storing highly structured, cleaned data.
Data integration. An external integration source stores data objects prepared according to each provider’s requirements. The Client uses BI tools to query this data integration layer, prepare analytics reports, and deliver them to its customers.
Event recording. The platform gathers and stores information about all the performed actions (e.g., data entry, transformation, delivery).
HIPAA-compliant data governance. The solution uses an identity and access management service to enable user authentication, authorization, and data access control for the Client’s staff.
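The flow through the first few components can be sketched in Python as follows. This is an in-memory illustration under stated assumptions, not the platform's implementation: the Pipeline and RawFile classes, the required patient_id field, and the event names are hypothetical, and real storage, warehousing, and access control are left out.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class RawFile:
    """A file kept in its initial format, enriched with metadata."""
    name: str
    content: bytes
    owner: str
    uploaded_at: datetime


@dataclass
class Pipeline:
    raw_storage: list = field(default_factory=list)
    staging: list = field(default_factory=list)
    events: list = field(default_factory=list)

    def _log(self, action: str, detail: str) -> None:
        # Event recording: every performed action is stored with a timestamp.
        self.events.append(
            {"action": action, "detail": detail, "at": datetime.now(timezone.utc)}
        )

    def land_raw(self, name: str, content: bytes, owner: str) -> RawFile:
        # Raw data storage: the content is kept untouched, metadata is attached.
        raw = RawFile(name, content, owner, datetime.now(timezone.utc))
        self.raw_storage.append(raw)
        self._log("data_entry", name)
        return raw

    def transform(self, records: list, required: tuple = ("patient_id",)) -> list:
        # Validation (hypothetical rule: required fields must be non-empty)
        # followed by exact-duplicate removal.
        seen = set()
        accepted = []
        for rec in records:
            if any(not rec.get(key) for key in required):
                continue
            fingerprint = tuple(sorted(rec.items()))
            if fingerprint in seen:
                continue
            seen.add(fingerprint)
            accepted.append(rec)
        self.staging.extend(accepted)
        self._log("transformation", f"{len(accepted)} of {len(records)} records kept")
        return accepted
```

Keeping the raw file unchanged alongside the event log means any record in the staging area can be traced back to the file and the step that produced it.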
After the platform was ready, ScienceSoft held webinars and workshops with the Client’s team to enable smooth user adoption.
Automated Healthcare Analytics Platform With a Unified Data Model for Multiple Healthcare Providers
The Client received a scalable and HIPAA-compliant data analytics platform that can be used to prepare BI reports for multiple healthcare providers. Since data extraction, cleansing, and warehousing are now automated and performed following a unified data model, the Client can significantly reduce the time spent preparing unique reports for each of its customers.
The Client also received exhaustive software documentation, including a data governance framework, a data strategy roadmap, and platform user guidelines.
Technologies and Tools
Azure.