Jaegool_'s log
Coursera IBM Data Science Course: Tools for Data Science
Summary
- The Data Science Task Categories include:
  - Data Management - storage, management, and retrieval of data
  - Data Integration and Transformation - streamlining data pipelines and automating data processing tasks
  - Data Visualization - graphical representation of data to help communicate insights
  - Modelling - building, deploying, monitoring, and assessing data and machine learning models
- Data Science Tasks are supported by the following:
  - Code Asset Management - store and manage code, track changes, and allow collaborative development
  - Data Asset Management - organize and manage data, provide access control, and back up assets
  - Development Environments - develop, test, and deploy code
  - Execution Environments - provide computational resources and run the code
The data science ecosystem consists of many open source and commercial options. It includes traditional desktop applications and server-based tools, as well as cloud-based services that can be accessed through web browsers and mobile interfaces.
Data Management Tools include relational databases, NoSQL databases, and Big Data platforms:
- MySQL and PostgreSQL are examples of open source Relational Database Management Systems (RDBMSes). IBM Db2 and SQL Server are examples of commercial RDBMSes, which are also available as cloud services.
- MongoDB and Apache Cassandra are examples of NoSQL databases.
- Apache Hadoop and Apache Spark are used for Big Data analytics.
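To make the relational side of data management concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module as a lightweight stand-in for an RDBMS such as MySQL or PostgreSQL. The table name `tools`, its columns, and the helper `tools_by_category` are illustrative assumptions, not part of the course material.

```python
import sqlite3


def tools_by_category(category):
    """Store a few tool names in an in-memory table and query one category."""
    # ":memory:" keeps the database in RAM, so nothing is written to disk.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE tools (name TEXT PRIMARY KEY, category TEXT NOT NULL)"
        )
        # Hypothetical sample rows, taken from the tool names listed above.
        conn.executemany(
            "INSERT INTO tools (name, category) VALUES (?, ?)",
            [
                ("MySQL", "relational"),
                ("PostgreSQL", "relational"),
                ("MongoDB", "nosql"),
                ("Apache Cassandra", "nosql"),
            ],
        )
        # Parameterized query: the "?" placeholder avoids string concatenation.
        rows = conn.execute(
            "SELECT name FROM tools WHERE category = ? ORDER BY name",
            (category,),
        ).fetchall()
        return [name for (name,) in rows]
    finally:
        conn.close()


if __name__ == "__main__":
    print(tools_by_category("relational"))  # ['MySQL', 'PostgreSQL']
```

The same storage, management, and retrieval steps (create a table, insert rows, query with a filter) carry over to a server-based RDBMS, although the connection setup and SQL dialect details would differ.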
Data Integration and Transformation Tools include Apache Airflow and Apache Kafka.
Data Visualization Tools include commercial offerings such as Cognos Analytics, Tableau, and Power BI, which can be used to build dynamic, interactive dashboards.
Code Asset Management Tools: Git is an essential code asset management tool. GitHub is a popular web-based platform for storing and managing source code. Its features, including version control, issue tracking, and project management, make it well suited to collaborative software development.
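The idea of a data pipeline as a set of dependent tasks can be sketched in plain Python. The example below does not use the Airflow or Kafka APIs; it only illustrates the general pattern of running extract, transform, and load steps in dependency order, using the standard-library `graphlib` module (Python 3.9+). The sample records and the function names are hypothetical.

```python
from graphlib import TopologicalSorter


def extract():
    # Hypothetical raw records; a real pipeline would read from a source system.
    return [{"name": " Alice ", "score": "90"}, {"name": "Bob", "score": "75"}]


def transform(records):
    # Strip stray whitespace and convert score strings to integers.
    return [{"name": r["name"].strip(), "score": int(r["score"])} for r in records]


def load(records, target):
    # Append the cleaned records to an in-memory list standing in for a warehouse.
    target.extend(records)
    return len(records)


def run_pipeline():
    warehouse = []
    results = {}
    steps = {
        "extract": lambda: extract(),
        "transform": lambda: transform(results["extract"]),
        "load": lambda: load(results["transform"], warehouse),
    }
    # Each step maps to the set of steps that must finish before it runs.
    dependencies = {
        "extract": set(),
        "transform": {"extract"},
        "load": {"transform"},
    }
    # static_order() yields the steps so that dependencies always come first.
    order = list(TopologicalSorter(dependencies).static_order())
    for step in order:
        results[step] = steps[step]()
    return order, warehouse


if __name__ == "__main__":
    order, warehouse = run_pipeline()
    print(order)
    print(warehouse)
```

Declaring dependencies separately from the step logic is what lets a scheduler decide the execution order; workflow tools build scheduling, retries, and monitoring on top of that basic idea.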
Development Environments: Popular development environments for Data Science include Jupyter Notebooks and RStudio.
- Jupyter Notebooks provide an interactive, browser-based environment for creating and sharing code, descriptive text, data visualizations, and other computational artifacts.
- RStudio is an integrated development environment (IDE) designed specifically for working with the R programming language, which is a popular tool for statistical computing and data analysis.