Jaegool_'s log
Coursera IBM Data Science Course: What is Data Science
Week 1: What Do Data Scientists Do?
Data science is the field of exploring, manipulating, and analyzing data, and using data to answer questions or make recommendations.
Summary
- Data science is the study of large quantities of data, which can reveal insights that help organizations make strategic choices.
- There are many paths to a career in data science; most, but not all, involve math, programming, and curiosity about data.
- New data scientists need to be curious, judgmental, and argumentative.
- Knowledgeable data scientists are in high demand. Jobs in data science pay high salaries for skilled workers.
- The typical workday for a data scientist varies depending on the type of project they are working on.
- Many algorithms are used to bring out insights from data.
- Some key data science-related terms you learned in this lesson: outliers, models, algorithms, JSON, XML, CSV, and regression.
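To make the file-format terms above concrete, here is a minimal sketch that reads the same record from CSV, JSON, and XML using only the Python standard library. The record (`Ada`, `36`) and the helper names `from_csv`, `from_json`, and `from_xml` are made up for illustration and do not come from the course.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical record, invented for illustration only.
CSV_TEXT = "name,age\nAda,36\n"
JSON_TEXT = '{"name": "Ada", "age": 36}'
XML_TEXT = "<person><name>Ada</name><age>36</age></person>"


def from_csv(text):
    """Parse the first data row of a CSV string into a dict."""
    row = next(csv.DictReader(io.StringIO(text)))
    # CSV has no types: every value arrives as a string.
    return {"name": row["name"], "age": int(row["age"])}


def from_json(text):
    """Parse a JSON object; numbers keep their type."""
    return json.loads(text)


def from_xml(text):
    """Parse a flat XML element; text nodes are strings, like CSV."""
    root = ET.fromstring(text)
    return {"name": root.findtext("name"), "age": int(root.findtext("age"))}


records = [from_csv(CSV_TEXT), from_json(JSON_TEXT), from_xml(XML_TEXT)]
for record in records:
    print(record)
```

All three parsers end up with the same dictionary. The practical difference is that JSON carries basic types, while CSV and XML values need explicit conversion.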
Week 2: Big Data and Data Mining, Deep Learning and Machine Learning
Notes
1. Apache Hadoop: Distributed storage and processing of large data sets
2. Apache Hive: Data warehouse for querying and managing large data sets
3. Apache Spark: General-purpose data processing engine
Summary:
- Big Data has five characteristics: velocity, volume, variety, veracity, and value.
- The cloud computing characteristics noted here are scalability, collaboration, accessibility, and software maintenance.
- Data mining has a six-step process: goal setting, selecting data sources, preprocessing, transforming, mining, and evaluation.
- The sheer amount of disparate data created by people, tools, and machines requires new, innovative, and scalable technology to drive transformation.
- Deep learning utilizes neural networks to teach itself patterns in inputs and outputs. Machine learning is a subset of AI that uses computer algorithms to learn about data and make predictions without explicitly programming the analysis methods into the system.
- Regression identifies the strength and amount of the correlation between one or more inputs and an output.
- Skills involved in processing Big Data include the application of statistics, machine learning models, and some computer programming.
- Generative AI, a subset of artificial intelligence, focuses on producing new data rather than just analyzing existing data. It allows machines to create content, including images, music, language, computer code, and more, mimicking creations by people.
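The regression bullet above can be illustrated with a minimal sketch of simple linear regression, fitted by ordinary least squares in plain Python. The data points and the function name `fit_line` are assumptions made up for this example. The course does not prescribe this implementation, and a real project would more likely use a library.

```python
from math import sqrt


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares.

    Also returns Pearson's r, which measures the strength of the
    linear relationship between xs and ys (between -1 and 1).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Made-up data: one input (xs) and one output (ys), roughly y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept, r = fit_line(xs, ys)
print(f"slope={slope:.2f} intercept={intercept:.2f} r={r:.3f}")
```

Here the slope is the "amount" (how much the output changes per unit of input) and `r` is the "strength" of the relationship. The fitted line can then be used for prediction, for example `slope * 6 + intercept` for an unseen input of 6.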
Finding Optimal Locations for New Stores
IBM Cloud Pak for Data
This notebook shows you how Decision Optimization can help to prescribe decisions for a complex constrained problem using CPLEX Modeling for Python to help determine the optimal location for a new store. This notebook requires the Commercial Edition of CPL
dataplatform.cloud.ibm.com
Week 3: Data Science Application Domains & Careers and Recruiting in Data Science
Key points:
1. Diverse Backgrounds of Data Scientists
2. Companies' Ideal Expectations
3. Realistic Hiring Approach
4. Importance of Passion and Curiosity
5. Essential Skills for Data Scientists
6. Communication and Storytelling
7. Creating an Effective Data Science Team
Report structure
- Cover Page: Often overlooked, it should include the report's title, author names, affiliations, contact details, the publisher's name, and the publication date.
- Table of Contents (ToC): Essential for longer documents, it provides an overview of the report's structure.
- Abstract/Executive Summary: Vital even for short reports, summarizing the main arguments in a concise manner.
- Introductory Section: Introduces the topic and sets the stage for the reader, often followed by a literature review which outlines existing research and identifies knowledge gaps.
- Methodology Section: Describes research methods and data sources, especially important if new data is collected.
- Results Section: Presents empirical findings using various methods like descriptive statistics, graphics, regression models, and data mining.
- Discussion Section: Builds on the results to craft the main arguments and relates findings back to the research questions and identified knowledge gaps.
- Conclusion Section: Generalizes findings, addresses potential future research developments, and often adopts a marketing approach to emphasize the study's contributions.
- References, acknowledgments, and appendices: Included after the conclusion as needed.
Summary:
- Data science helps physicians provide the best treatment for their patients, helps meteorologists predict the extent of local weather events, and can even help predict natural disasters like earthquakes and tornadoes.
- Companies can start on their data science journey by capturing data. Once they have data, they can begin analyzing it.
- Everyone who uses the Internet generates massive amounts of data daily.
- Amazon and Netflix use recommendation engines, and UPS uses data from customers, drivers, and vehicles to use the drivers’ time and fuel efficiently.
- The purpose of the final deliverable of a Data Science project is to communicate new information and insights from the data analysis to key decision-makers.
- The report should present a thorough analysis of the data and communicate the project findings.
- Companies should look for someone excited about working with the data in their particular industry. They should seek out someone curious who can ask interesting, meaningful questions about the types of data they intend to collect. They should hire people who love working with data, are fluent in statistics, and are competent in applying machine learning algorithms.
- A clearly organized and logical report should communicate the following to the reader:
  - What they gain by reading the report
  - Clearly defined goals
  - The significance of your contribution
  - Appropriate context, given through sufficient background
  - Why this work is practical and useful
  - Plausible future developments that might result from your work