Jaegool_'s log

Data Science

Coursera IBM Data Science Course: What is Data Science

Jaegool 2024. 1. 23. 01:50

Week 1: What Do Data Scientists Do?

Data science is the field of exploring, manipulating, and analyzing data, and using data to answer questions or make recommendations.

Summary

- Data science is the study of large quantities of data, which can reveal insights that help organizations make strategic choices.

- There are many paths to a career in data science; most, but not all, involve math, programming, and curiosity about data.

- New data scientists need to be curious, judgemental, and argumentative.

- Knowledgeable data scientists are in high demand. Jobs in data science pay high salaries for skilled workers.

- The typical work day for a Data Scientist varies depending on what type of project they are working on.

- Many algorithms are used to bring out insights from data.

- Some key data science-related terms you learned in this lesson: outliers, models, algorithms, JSON, XML, CSV, and regression.
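To make a few of these terms concrete, here is a minimal sketch using only the Python standard library. It parses the same kind of record from CSV and JSON text, then flags outliers with Tukey's interquartile-range rule. The sample data, the function names, and the choice of the IQR rule are my own assumptions for illustration; the course does not prescribe a particular outlier method.

```python
import csv
import io
import json
import statistics

# Hypothetical sample data: daily sales in CSV form and in JSON form.
CSV_TEXT = "day,sales\n1,102\n2,98\n3,105\n4,99\n5,480\n6,101\n7,97\n8,103\n"
JSON_TEXT = '[{"day": 1, "sales": 102}, {"day": 2, "sales": 98}]'


def load_csv(text):
    """Parse CSV text into a list of dicts with integer values."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: int(value) for key, value in row.items()} for row in reader]


def load_json(text):
    """Parse a JSON array of objects into a list of dicts."""
    return json.loads(text)


def find_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


csv_rows = load_csv(CSV_TEXT)
json_rows = load_json(JSON_TEXT)
sales = [row["sales"] for row in csv_rows]
outliers = find_outliers(sales)
print(outliers)  # the 480 entry stands far apart from the other values
```

CSV is flat and row-oriented, while JSON (like XML) can nest records; both end up here as plain Python dictionaries before any analysis happens.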

Week 2: Big Data and Data Mining, Deep Learning and Machine Learning

note

1. Apache Hadoop: Distributed storage and processing

2. Apache Hive: Provides large data set management

3. Apache Spark: Data processing engine

Summary:

Big Data has five characteristics (the five V's): velocity, volume, variety, veracity, and value.
Cloud computing benefits include scalability, collaboration, accessibility, and software maintenance.
Data mining has a six-step process: goal setting, selecting data sources, preprocessing, transforming, mining, and evaluation. 
The availability of so much disparate data created by people, tools, and machines requires new, innovative, and scalable technology to drive transformation.
Deep learning utilizes neural networks to teach itself patterns in inputs and outputs. Machine learning is a subset of AI that uses computer algorithms to learn about data and make predictions without explicitly programming the analysis methods into the system.
Regression identifies the strength and amount of the correlation between one or more inputs and an output.
Skills involved in processing Big Data include the application of statistics, machine learning models, and some computer programming.
Generative AI, a subset of artificial intelligence, focuses on producing new data rather than just analyzing existing data. It allows machines to create content, including images, music, language, computer code, and more, mimicking creations by people.
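The regression point above can be sketched with a simple one-input case. The snippet below fits a straight line by ordinary least squares and also reports the Pearson correlation coefficient as a measure of how strong the linear relationship is. The data and the function name `fit_line` are hypothetical, chosen only for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept, plus Pearson r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical data: advertising spend (input) vs. units sold (output).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
units = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept, r = fit_line(spend, units)
print(f"slope={slope:.2f} intercept={intercept:.2f} r={r:.3f}")
```

The slope says how much the output changes per unit of input, and a value of r close to 1 or -1 indicates a strong linear relationship. In practice a library such as scikit-learn or statsmodels would usually do this, but the arithmetic is the same.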

Finding Optimal Locations for New Stores

https://dataplatform.cloud.ibm.com/exchange/public/entry/view/aceccfd155454fd9741852e12e9cce4e?context=cpdaas

IBM Cloud Pak for Data

This notebook shows you how Decision Optimization can help to prescribe decisions for a complex constrained problem using CPLEX Modeling for Python to help determine the optimal location for a new store. This notebook requires the Commercial Edition of CPL

dataplatform.cloud.ibm.com
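The linked notebook uses CPLEX Modeling for Python. The sketch below does not use CPLEX and is not the notebook's model. It is a toy brute-force version of the same kind of question: given candidate sites and customer positions, which k sites minimize the total distance from each customer to the nearest open store? All coordinates and function names are hypothetical, and checking every combination only works for very small inputs, which is why a real solver is used for realistic problem sizes.

```python
from itertools import combinations
import math


def total_distance(open_sites, customers):
    """Sum of each customer's distance to its nearest open site."""
    return sum(
        min(math.dist(customer, site) for site in open_sites)
        for customer in customers
    )


def best_sites(candidates, customers, k):
    """Try every k-subset of candidate sites and keep the cheapest one."""
    return min(
        combinations(candidates, k),
        key=lambda sites: total_distance(sites, customers),
    )


# Hypothetical (x, y) coordinates, not taken from the notebook.
customers = [(0, 0), (0, 1), (1, 0), (9, 9), (10, 9), (9, 10)]
candidates = [(0, 0), (5, 5), (9, 9), (10, 0)]

chosen = best_sites(candidates, customers, k=2)
print(chosen)
```

With two clusters of customers, the search picks one site inside each cluster rather than a single central site, which is the intuition behind this kind of location problem.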

Week 3: Data Science Application Domains & Careers and Recruiting in Data Science

Key points:

1. Diverse Backgrounds of Data Scientists

2. Companies' Ideal Expectations

3. Realistic Hiring Approach

4. Importance of Passion and Curiosity

5. Essential Skills for Data Scientists

6. Communication and Storytelling

7. Creating an Effective Data Science Team

Report structure

Cover Page: Often overlooked, it should include the report's title, author names, affiliations, contact details, the publisher's name, and the publication date.
Table of Contents (ToC): Essential for longer documents, it provides an overview of the report's structure.
Abstract/Executive Summary: Vital even for short reports, summarizing the main arguments in a concise manner.
Introductory Section: Introduces the topic and sets the stage for the reader, often followed by a literature review which outlines existing research and identifies knowledge gaps.
Methodology Section: Describes research methods and data sources, especially important if new data is collected.
Results Section: Presents empirical findings using various methods like descriptive statistics, graphics, regression models, and data mining.
Discussion Section: Builds on the results to craft the main arguments and relates findings back to the research questions and identified knowledge gaps.
Conclusion Section: Generalizes findings, addresses potential future research developments, and often adopts a marketing approach to emphasize the study's contributions.

+ references, acknowledgments, and appendices

Summary:

Data Science helps physicians provide the best treatment for their patients, helps meteorologists predict the extent of local weather events, and can even help predict natural disasters like earthquakes and tornadoes.
Companies can start on their data science journey by capturing data. Once they have data, they can begin analyzing it.
Everyone who uses the Internet generates massive amounts of data daily.
Amazon and Netflix use recommendation engines, and UPS uses data from customers, drivers, and vehicles to make efficient use of drivers' time and fuel.
The purpose of the final deliverable of a Data Science project is to communicate new information and insights from the data analysis to key decision-makers.
The report should present a thorough analysis of the data and communicate the project findings.
Companies should look for someone excited about working with the data in their particular industry. They should seek out someone curious who can ask interesting, meaningful questions about the types of data they intend to collect. They should hire people who love working with data, are fluent in statistics, and are competent in applying machine learning algorithms.
A clearly organized and logical report should communicate the following to the reader:
- What they gain by reading the report
- Clearly defined goals
- The significance of your contribution
- Appropriate context by giving sufficient background
- Why this work is practical and useful
- Plausible future developments that might result from your work
