A. RATIONALE
This assignment aims to evaluate students’ proficiency in creating a database system and designing a model for comprehensive data management. The objectives for this module include:
CLO1: Creating database systems using appropriate design techniques and scripting languages (C6, PLO1, MQF1)
CLO2: Designing models for storage, cleansing, transforming, and integrating data to meet organizational standards (P7, PLO2, MQF2)
B. ASSIGNMENT/PROJECT EVALUATION
The following criteria will be utilized to assess your assignment/project submission:
C. REQUIREMENTS
You are tasked with creating a comprehensive data extraction, transformation, and loading (ETL) project based on a real dataset. You have the freedom to choose any dataset as long as it is a freely shared valid dataset generated by a legitimate organization or company.
The ETL project will involve data extraction, transformation, and loading processes along with supporting activities, as outlined in the tasks below.
Extract, Transform, Load (ETL) Case Study
You are assigned to an ETL project that involves processing a dataset in ETL and analytics activities. The project begins with web scraping activity to extract public data available on a website, such as weather data or Covid-19 pandemic-related information. For web scraping, you may utilize automation tools like Python-based Beautiful Soup. The tasks are divided into the following sections.
Accomplishing these tasks successfully will lead to the completion of the project. You may use the Python programming language for the tasks, utilizing an Integrated Development Environment (IDE) tool such as Jupyter Notebook. At the end of the extraction, ensure the dataset has more than sufficient attributes (approximately 10 attributes) including various data types like character, integer, number/float, date, etc. The dataset may also contain missing values and outliers.
Task 1: Data Extraction and Understanding the Dataset
Select a source website for web scraping and justify your choice. After web scraping, list down all available column names and provide explanations.
Task 2: Identifying the Variables/Columns
Analyze the dataset and recommend its use for decision-making purposes. Explain the relevance of each variable in the dataset and its suitability as an independent or dependent variable for the proposed model.
Task 3: Data Categorization and Data Profiling
Categorize columns into independent and dependent variables. Explain data profiling and its components. Evaluate the dataset with data profiling components.
Task 4: Data Transformation
Decide on data transformation techniques such as eliminating duplicate rows, handling missing values, converting categorical columns to numerical ones, detecting and handling outliers, and standardizing the data. Include justifications and screenshots.
Task 5: Analysis (Factor analysis / Classifications)
Describe the data and its correlations, conduct factor analysis or classifications, explain the relevance of discovered factors, and visualize them. Drop unnecessary variables.
Task 6: Data Loading
Load the transformed data into a local SQL database (e.g., MySQL, PostgreSQL) and provide a table specification using the specified template.
Compare and contrast the concepts of business buying behaviour and consumer buying behaviour.
Read MoreBusiness and Economics Develop four customer satisfaction strategies, list three advantages and disadvantages of each, and make a final recommendation using change management knowledge from previous assignments.
Read MoreCreate a Story using a Group of Prompt Words-Choose one of the following groups of words in which to create a story.
Read MoreHow might your journey to start/stop have some similarities to a client struggling to attain sobriety?
Read MoreIdentify the ratio of affirming statements to constructive statements correctly for four out of five scenarios.
Read MoreWhat information would you need to help you assess the adequacy of the protection of human rights?
Read MoreExplain why it is so important to represent the organizations interests see exhibit 1.8 What are some possible consequences of not doing so?
Read MoreDescribe how she honored her values in approaching the employee behavior she witnessed.Was it appropriate of Thorn to report the incident to HR?
Read MoreOTHM Level 5 Partnership Working Skills in Health and Social Care Assignment Brief
Read MoreHow effective are Educational Interventions in increasing awareness and knowledge about prostate Cancer amongst Black men Dissertation,
Read More