2022 to 2024
Worked @ Australian Government
Architected and engineered a Google Cloud Platform organization and an Azure Entra / DevOps organization
For both the migration of data from a Hadoop platform (NiFi and Impala/Hive) to Google Cloud Platform and the ingestion of public data into BigQuery
Designing and developing a monorepo (using Python Poetry) for managing Google Cloud Functions / Workflows / Apache Beam / Dataflow / Dataform pipelines in Python / JavaScript
Building custom Docker images for Dataflow and Flask web applications
Deploying to GCP using Terraform and Bash scripts with the gcloud SDK as scheduled Dataflow runs
Analysing source systems (Teradata and Microsoft SQL Server) and reverse-engineering requirements from the NiFi pipelines due to missing corporate knowledge and documentation
Calling an LLM (Google Gemini) through Vertex AI using BigQuery ML from Dataform
Management of project tasks and deliverables in Jira and Azure DevOps
Full stack machine learning architecture as well as web development
Development and architecture of a web application for integration of Natural Language Processing (NLP) predictions and labelling with business reporting
Python Flask, Alembic, SQLite3, AdminLTE (built on top of Bootstrap) with JavaScript libraries including Dropzone.js
Use of Git (and Bitbucket)
Configuration of Linux environment
Maintaining and developing R scripts that read from and write to many data sources and targets
Sourcing data from multiple data warehouses (Teradata and Oracle) into an R environment