Innocent Musanzikwa, Developer in Calgary, AB, Canada

Innocent Musanzikwa

Verified Expert in Engineering

Data Engineer and Developer

Location
Calgary, AB, Canada
Toptal Member Since
August 10, 2021

Inno is a seasoned data engineer and developer who has worked at IRI, a top retail data analytics company, in Africa and North America for the past decade and as a freelance consultant for the past couple of years. As a SQL and ETL developer, he has created quality data warehouses using industry-standard techniques like Kimball and Data Vault. As a data engineer, Inno has built robust, scalable data pipelines both on-premises and in the cloud using several cutting-edge technologies.

Portfolio

Darwill, Inc.
SQL, Tableau, Python, Data Engineering, Data Analysis, ETL, Data Warehousing...
SFL Scientific, LLC
SQL, SQL Server Integration Services (SSIS), MariaDB, Microsoft SQL Server...
Aviation Holdings LLC
Business Intelligence (BI), SQL, APIs, SQL Server DBA, Dimensional Modeling...

Experience

Availability

Part-time

Preferred Environment

SQL, PySpark, Python, Hadoop, Apache Hive, Azure Synapse, Oracle, SQL Server Integration Services (SSIS), Azure Data Factory, Data Warehousing

The most amazing...

...big data warehousing and data integration solution I've designed, built with Python, SQL, ADF, Hadoop, Hive, and Spark, won an RFP in Canada against six competitors.

Work Experience

Data Engineer

2022 - 2022
Darwill, Inc.
  • Built Tableau dashboards and visualizations using AWS Redshift and Aurora databases.
  • Created AWS Lambda functions running Python for custom ETL tasks and ad-hoc requests.
  • Managed AWS Redshift and Aurora databases and designed data warehouses and data migrations.
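  • Redesigned the client's data warehouse on the AWS tech stack, improved their migration process by introducing federated queries and Lambda functions running Python pipelines, and completely revamped their Tableau dashboards.
Technologies: SQL, Tableau, Python, Data Engineering, Data Analysis, ETL, Data Warehousing, Amazon Web Services (AWS), Relational Databases, Data Cleaning, Data Science, Databases, PostgreSQL, AWS Lambda, Database Development, Data Visualization, Dedicated SQL Pool (formerly SQL DW), Azure SQL Data Warehouse, Database Modeling, MySQL, Entity Relationships, Business Analysis, Database Design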

Data Engineer

2022 - 2022
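SFL Scientific, LLC
  • Consulted on an existing, poorly designed SSIS data integration project and helped identify bottlenecks and inefficiencies.
  • Redesigned the existing data pipeline in SSIS to be efficient and scalable.
  • Performed SQL tuning and SQL code reviews to improve process efficiency.
Technologies: SQL, SQL Server Integration Services (SSIS), MariaDB, Microsoft SQL Server, Data Transformation, Python, Database Schema Design, iPaaS, CI/CD Pipelines, Relational Databases, Stored Procedure, Data Analysis, T-SQL (Transact-SQL), SQL DML, Database Development, Data Analytics, Data Visualization, Dedicated SQL Pool (formerly SQL DW), Azure SQL Data Warehouse, Database Modeling, Entity Relationships, Tableau, Business Analysis, Database Design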

BI and Data Warehouse Specialist

2021 - 2022
Aviation Holdings LLC
  • Designed and developed data pipelines to integrate data from the QuickBooks API, the Sage Intacct API, and spreadsheets into Azure SQL.
  • Designed and developed a data warehouse in Azure SQL.
  • Designed and created business reports and KPI dashboards using Power BI.
  • Developed complex SQL scripts to manage data transformations and speed up integration.
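Technologies: Business Intelligence (BI), SQL, APIs, SQL Server DBA, Dimensional Modeling, Relational Databases, Microsoft Power BI, Cloud, Git, REST APIs, Synapse, DAX, Dashboard Design, Dashboards, Stored Procedure, Tableau, Data Analysis, T-SQL (Transact-SQL), SQL DML, Database Development, Data Analytics, Microsoft Power Automate, Data Visualization, Database Modeling, Entity Relationships, Business Analysis, Database Design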

Data Analyst for Migration Project

2021 - 2021
JLL - JLLT Data
  • Developed the data pipeline to integrate data from Salesforce to Microsoft SQL.
  • Designed advanced SQL code, e.g., CTEs, stored procedures, and functions, to manage data transformations.
  • Performed SQL tuning to improve ETL efficiencies and process scalability.
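  • Consulted on standard operating procedures and best practices.
Technologies: SQL, T-SQL (Transact-SQL), ETL, Salesforce, Data Migration, Relational Databases, Microsoft Power BI, SQL Server Reporting Services (SSRS), Stored Procedure, Data Analysis, Google Sheets, SQL DML, Database Development, Data Analytics, Database Modeling, Entity Relationships, Tableau, Business Analysis, Database Design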

Director | Data Engineering

2019 - 2021
IRI
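  • Developed Azure Data Factory pipelines to integrate data from Apache Hive, HDFS, OAuth 2 APIs, and various flat-file types into Azure SQL.
  • Managed onshore and offshore big data development teams, assigning tasks and tracking progress in Jira.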
  • Oversaw data strategy and recommendations for new data sources and ongoing projects.
  • Mentored big data engineers and helped them improve their skills.
  • Architected new data models and upgraded old data warehouses as per client request or technology change.
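Technologies: Python, Apache Hive, Hadoop, Azure Synapse, Azure Data Factory, Bash Script, SQL, Azure SQL, Databricks, Data Engineering, ETL, Data Modeling, Databases, Azure, Data, Data Architecture, Business Intelligence (BI), Data Pipelines, Apache Airflow, Data Integration, Big Data, T-SQL (Transact-SQL), Data Migration, Snowflake, Data Build Tool (dbt), Apache Kafka, ELT, SQL Server Integration Services (SSIS), Data Transformation, Dimensional Modeling, Relational Databases, Microsoft Power BI, Cloud, SQL DML, Database Development, Dedicated SQL Pool (formerly SQL DW), Azure SQL Data Warehouse, Database Modeling, Entity Relationships, Database Design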

ETL Architect

2016 - 2019
IRI
  • Developed SQL-based data warehouses on-premises and in the cloud.
  • Integrated various data sources, from flat files to cloud-based sources like Snowflake, AWS, and data lakes, into Azure data warehouses and Apache Hive on Hadoop.
  • Created scalable data pipelines and improved efficiencies on the existing ones.
  • Trained and upskilled new data developers and participated in code reviews.
  • Maintained system documentation of all business data components and strategies.
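Technologies: SQL Server Integration Services (SSIS), Azure Synapse, Azure Data Factory, Databricks, PySpark, SQL, Oracle, Apache Hive, Hadoop, Data Warehouse Design, Data Engineering, ETL, Data Modeling, SQL Stored Procedures, Databases, Data, Data Architecture, Business Intelligence (BI), Data Pipelines, Data Integration, Big Data, BigQuery, JavaScript, T-SQL (Transact-SQL), Data Migration, Snowflake, Amazon Web Services (AWS), Amazon Elastic MapReduce (EMR), ELT, APIs, Data Transformation, MariaDB, SQL Server DBA, Dimensional Modeling, Relational Databases, Microsoft Power BI, Cloud, REST APIs, SQL DML, Database Development, Dedicated SQL Pool (formerly SQL DW), Azure SQL Data Warehouse, Database Modeling, Entity Relationships, Performance Tuning, Dynamic SQL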

SQL Lead Developer

2012 - 2016
IRI
  • Developed SQL-based data warehouses and data marts.
  • Wrote SQL queries to feed SSRS reports.
  • Used SSIS, Talend, and DataStage for ETL processes depending on the client's requirements.
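  • Created custom business reports using SQL Server Reporting Services (SSRS).
  • Managed junior developers and hosted stand-up development meetings.
Technologies: SQL, SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), PSQL, MySQL, Data Warehousing, Data Engineering, ETL, Data Modeling, SQL Stored Procedures, Databases, Data, Data Architecture, Business Intelligence (BI), Data Pipelines, Data Integration, Big Data, T-SQL (Transact-SQL), Data Migration, ELT, Data Transformation, Dimensional Modeling, Relational Databases, Microsoft Power BI, REST APIs, SSAS, Dashboard Design, Dashboards, SQL DML, Database Development, SSRS Reports, Dedicated SQL Pool (formerly SQL DW), Azure SQL Data Warehouse, Database Modeling, SQL Server 2015, Entity Relationships, Business Analysis, Performance Tuning, Dynamic SQL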

SQL/ETL Developer and Consultant

2010 - 2012
Mi9 Retail (formerly JustEnough Software)
  • Managed SQL replication between mobile devices and SQL Server.
  • Created SQL data warehouses using the Kimball methodology for reporting purposes.
  • Designed and developed ETL packages using SQL Server Integration Services (SSIS).
  • Designed and developed reports in SQL Server Reporting Services (SSRS).
  • Performed database tuning and code reviews for any code being deployed to production.
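Technologies: SQL, SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), Microsoft SQL Server, Data Engineering, ETL, Data Modeling, SQL Stored Procedures, Databases, Data, Data Architecture, Business Intelligence (BI), Data Pipelines, Data Integration, Big Data, T-SQL (Transact-SQL), Data Migration, Data Transformation, Relational Databases, Microsoft Power BI, SSAS, SQL DML, Database Development, SSRS Reports, Database Modeling, SQL Server 2015, Entity Relationships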

Data Migration from Azure SQL to Snowflake

http://github.com/innowarue/ADF
This project involved migrating data from an Azure SQL database to a Snowflake data warehouse using an Azure Data Factory data pipeline. It took me only minutes to create, thanks to my proficiency in Data Factory.

I replaced the original data sources with my own Azure and Snowflake accounts to make the project publicly available without compromising confidentiality.

Data Integration from an OAuth2 API

I created an automated data pipeline to integrate data accessible via an OAuth2-based API in JSON format into a cloud-based data warehouse solution. The solution used Python and Spark on Databricks integrated into an Azure Data Factory pipeline.
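
As a rough illustration of the first two steps of such a pipeline, the sketch below builds an OAuth2 client-credentials token request and flattens nested JSON records into warehouse-friendly rows. It is not the project's code: it uses only the Python standard library instead of Spark on Databricks, assumes the client-credentials grant, and the function names and sample fields are hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_token_request(token_url, client_id, client_secret):
    """Build (but do not send) an OAuth2 client-credentials token request.

    Assumption: the API accepts HTTP Basic client authentication.
    """
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    headers = {
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    return urllib.request.Request(token_url, data=body, headers=headers, method="POST")


def flatten_record(record, parent_key="", sep="_"):
    """Flatten nested JSON objects into a single-level dict of column -> value."""
    flat = {}
    for key, value in record.items():
        column = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, prefixing child keys with the parent.
            flat.update(flatten_record(value, column, sep))
        else:
            flat[column] = value
    return flat


def to_rows(json_payload):
    """Parse a JSON array of objects into flat rows ready for a warehouse load."""
    return [flatten_record(item) for item in json.loads(json_payload)]
```

Calling `to_rows` on a payload with a nested `store` object would yield columns such as `store_city`. In a Databricks job, the flat rows could then be handed to Spark for loading.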

SQL Server Replication to Mobile Devices

I created a replication system that synced data between mobile devices and Microsoft SQL Server. Field sales reps would collect information in the field, upload it to SQL Server using SQL CE, and download any updates from SQL Server via the mobile replication I set up.
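
The upload-then-download cycle can be sketched roughly as follows. This is an illustration only, not the original system: two in-memory SQLite databases stand in for SQL CE and SQL Server, the `visits` table and its columns are hypothetical, and conflicts are resolved with a simple last-writer-wins rule on a `modified_at` value, which is an assumption rather than how SQL Server replication resolves them.

```python
import sqlite3

# Hypothetical table shared by the device and the server.
SCHEMA = "CREATE TABLE visits (id TEXT PRIMARY KEY, note TEXT, modified_at INTEGER)"

# Insert new rows; overwrite an existing row only if the incoming one is newer.
UPSERT = """
INSERT INTO visits (id, note, modified_at) VALUES (?, ?, ?)
ON CONFLICT(id) DO UPDATE SET
    note = excluded.note,
    modified_at = excluded.modified_at
WHERE excluded.modified_at > visits.modified_at
"""


def new_store():
    """Create an empty in-memory database with the shared schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def copy_changes(source, target):
    """Push every row from source into target, keeping the newer version."""
    rows = source.execute("SELECT id, note, modified_at FROM visits").fetchall()
    target.executemany(UPSERT, rows)
    target.commit()


def sync(device, server):
    """Upload the device's changes, then download the server's updates."""
    copy_changes(device, server)
    copy_changes(server, device)
```

After `sync(device, server)`, both stores hold the same rows, with the most recently modified version of each record. A production design would track changes incrementally instead of copying every row.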

In-place Data Integration for an Acquisition

I created an in-place ETL integration for a company acquisition and merger, bringing the two companies' data into a single warehouse while continuously delivering weekly reports to the client services and retail service teams.

Kafka Streaming and Data Integration

I created an automated data pipeline to integrate data accessible via a Kafka stream, ingesting it into Spark Streaming using Spark and Python and loading it into a Cloudera Hadoop file system accessible using a Hive data warehouse solution.

Languages

SQL, Python, Bash Script, T-SQL (Transact-SQL), Snowflake, Stored Procedure, SQL DML, Scala, JavaScript, Bash

Frameworks

Hadoop, Spark, Windows PowerShell, ADF

Libraries/APIs

PySpark, REST APIs, Spark Streaming

Tools

Microsoft Power BI, Tableau, BigQuery, Synapse, SSAS, Apache Airflow, Amazon Elastic MapReduce (EMR), Git, Google Sheets

Paradigms

ETL, Business Intelligence (BI), Dimensional Modeling, Database Development, Database Design, Data Science

Platforms

Amazon Web Services (AWS), AWS Lambda, Azure SQL Data Warehouse, Dedicated SQL Pool (formerly SQL DW), Azure, Microsoft Power Automate, Azure Synapse, Oracle, Databricks, Apache Kafka, Salesforce, Zeppelin

Storage

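Apache Hive, MySQL, SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), PSQL, Microsoft SQL Server, SQL Stored Procedures, PostgreSQL, Databases, Data Pipelines, Data Integration, Relational Databases, Database Architecture, RDBMS, Database Modeling, Dynamic SQL, NoSQL, SQL Server DBA, Database Replication, Azure SQL, MariaDB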

Other

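Azure Data Factory, Data Warehousing, Data Analysis, Data Engineering, Data, Data Architecture, Big Data, Data Migration, ELT, Data Warehouse Design, Data Transformation, Database Schema Design, ETL Tools, Scripting Languages, Data Analytics, Data Visualization, SSRS Reports, SQL Server 2015, Entity Relationships, Business Analysis, Performance Tuning, Data Modeling, Cloud, APIs, Dashboard Design, Dashboards, Web Scraping, Data Build Tool (dbt), iPaaS, CI/CD Pipelines, DAX, Data Cleaning, Azure Databricks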

2013 - 2015

Bachelor's Degree in Information Technology

University of South Africa - Pretoria, South Africa

August 2023 - August 2025

Databricks Certified Data Engineer Associate

Databricks

August 2023 - August 2025

SnowPro Core

Snowflake

December 2020 - December 2022

Certified Apache Spark and Hadoop Developer

Cloudera

December 2019 - Present

Analyzing Big Data with Hive

LinkedIn Learning

December 2019 - Present

Advanced NoSQL for Data Science

LinkedIn Learning
