Vitória, Brazil
Hello, I'm

Thiago Baptista

Engineer. Builder. Problem Solver.

Senior Data & Analytics Engineer specializing in Data Lakehouse and Big Data solutions. Building end-to-end analytical solutions with Databricks, Apache Spark, Delta Lake, and Microsoft Fabric.

About Me

I’m Thiago Baptista, a Senior Data & Analytics Engineer with over two decades of experience turning raw data into strategic assets. My journey started with relational databases and BI in 2000 and evolved into modern Data Lakehouse architectures with Databricks, Apache Spark, and Microsoft Fabric. I design and build end-to-end analytical solutions: from ingestion pipelines to Gold-layer data products that power executive dashboards and drive business decisions.

Quick Facts

Current Role

Senior Analytics Engineer at Natura (via HVAR)

Education

Federal University of Espírito Santo (UFES), B.Sc. Computer Science

Certifications

  • Databricks Certified Data Engineer Associate
  • Microsoft Certified: Fabric Data Engineer Associate
  • Microsoft Certified: Azure Data Fundamentals
  • GitHub Foundations

Languages

Portuguese (Native), English (C1 Advanced)

Skills

End-to-end data engineering and analytics, from Data Lakehouse design to governed data products that drive business decisions.

Lakehouse Analytics

Data Lakehouse with Databricks and Microsoft Fabric. Medallion architecture, Delta Lake, dimensional modeling, Silver-to-Gold transformations with PySpark and Spark SQL, business metric definition, and data delivery for Tableau and Power BI.

Pipelines & CI/CD

Scalable data pipelines with Apache Spark and Delta Lake. CI/CD automation with Azure DevOps and GitHub Actions for deployment and versioning of data artifacts.

Governance & Quality

Unity Catalog and Microsoft Purview. Data cataloging, lineage tracking, access control, metric dictionaries. Data quality with Delta Live Tables expectations, unit testing, and validation checks.

Leadership & Industry

Leading multidisciplinary teams, defining architecture standards, and translating business requirements into data solutions. Sectors: retail, industry, agribusiness, cosmetics, logistics, and government.

Experience

From relational databases to the modern Data Lakehouse: over two decades of building data solutions that drive strategic decisions.

Jan 2026 - Present

Natura (via HVAR)

Senior Analytics Engineer

Cosmetics & Consumer Goods | São Paulo, Brazil

Latin America's largest cosmetics company, operating across multiple countries and brands

Key Achievements
  • Design and build Gold-layer data products in Databricks feeding executive Tableau dashboards
  • Silver-to-Gold transformations with PySpark and Spark SQL: OBT, Star Schema, SCD patterns
  • Multi-brand, multi-country financial and commercial KPIs: revenue, pricing, approval rates, customer lifecycle
  • Business metric definition and documentation aligned with Natura’s Data Hub standards and Unity Catalog
  • DataViz enablement: query packages with field-level documentation for Tableau developers

Sep 2025 - Jan 2026

Natura (via HVAR)

Senior Data Engineer

Cosmetics & Consumer Goods | São Paulo, Brazil

Data modernization and SAP-Databricks integration project

Key Achievements
  • Technical lead for SAP data ingestion (ECC, BW) into Databricks via SAP BDC and AecorSoft connectors
  • Bronze/silver/gold layer design and evolution based on Delta Lake
  • Spark/PySpark pipeline optimization with advanced partitioning and data versioning
  • CI/CD pipelines for notebook and workflow deployment using GitHub
  • Technical standards: naming conventions, table versioning, Unity Catalog documentation

Dec 2024 - Sep 2025

Prodesp (via AlmavivA)

Senior Data Engineer

Government | São Paulo, Brazil

GDAP - Digital Cabinet for Public Administration, serving the Governor of São Paulo State

Key Achievements
  • Scalable data pipelines with Databricks and Spark for state-wide data integration
  • Data Lakehouse architecture to unify and democratize access to government data
  • CI/CD with Azure DevOps: fully automated pipelines for ingestion, transformation, and deployment
  • Power BI semantic models and dashboards for strategic government consumption

Mar 2024 - Dec 2024

Prodesp (via AlmavivA)

Senior Database Administrator

Government | São Paulo, Brazil

Supporting the São Paulo State Secretary of Economic Development

Key Achievements
  • Advanced SQL Server: views, CTEs, stored procedures, T-SQL scripts for government programs
  • Production environment support with proactive monitoring and SLA-based incident management
  • Python automation scripts for data extraction, analysis, and integration

Apr 2014 - Mar 2024

Elever Vision

Senior Data Engineer

Consulting (Multi-sector) | Vitória, Brazil

Data engineering consulting across retail, industry, media, and entertainment

Key Achievements
  • Data Lakes and Data Warehouses with Azure Storage, Azure SQL Database, SQL Server, PostgreSQL
  • ETL pipeline orchestration with Azure Data Factory, Apache Airflow, and SSIS
  • CI/CD pipelines with Azure DevOps and GitHub for data lifecycle governance
  • Power BI dashboards and Python (pandas) automation for large-scale data processing

Mar 2010 - Jan 2014

Vale / VLI

Data Engineer

Mining & Logistics | Belo Horizonte, Brazil

BI and Data Warehousing for railway and port logistics operations

Key Achievements
  • Technical lead for a 15-person team building Data Warehouse for FP&A processes
  • Multidimensional modeling in SQL Server: fact/dimension tables, complex T-SQL
  • ETL orchestration with SSIS integrating SAP, Oracle ERP, Cognos, and operational systems
  • Financial analysis solutions for demand forecasting and cost allocation

Feb 2000 - Mar 2010

Fundação Ceciliano Abel de Almeida

Data Engineer

Education / Non-profit | Vitória, Brazil

Non-profit affiliated with the Federal University of Espírito Santo (UFES)

Key Achievements
  • BI layer over ERP Sapiens using SQL Server for advanced analytics and management reporting
  • Database administration: capacity planning, backup, disaster recovery, high availability
  • ETL automation with Visual Basic and Crystal Reports dashboards

Tech Stack

Core technologies spanning data platforms, programming, orchestration, and visualization.

Big Data & Processing

Apache Spark
Delta Lake

Data Platforms

Databricks
Microsoft Fabric

Programming & Query Languages

Python
PySpark
Spark SQL
T-SQL

Data Governance

Unity Catalog
Microsoft Purview

Databases

PostgreSQL
SQL Server
Azure SQL Database
DynamoDB

Visualization

Power BI
Tableau
Grafana

Cloud & Storage

Azure
OneLake
ADLS Gen2
AWS
Amazon S3

CI/CD & DevOps

Git
GitHub
Azure DevOps
GitHub Actions

Project Management

Azure Boards
GitHub Projects
Jira
Kanban
Scrum

Let's Connect

Interested in discussing data architecture, lakehouse solutions, or collaboration opportunities? Let's connect.