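My advisor would like to publish a translation of a Julia book on data mining, aimed at teaching and outreach. The current candidates are the titles in the "Books" and "Classes using Julia for teaching" sections of the linked page. If you have any recommendations, please reply. Thanks!
To be able to apply to the university or college for publication funding, we would probably prefer published books that are not open source. If there is a book you would like to read, please reply with its title, or +1 an existing title. I will also add to the candidate list myself. Each person may cast multiple votes. We will lean toward the title with the most votes.
After that we will contact the original authors to discuss translation rights (no guarantee that this will work out...). Suggestions on how this could contribute to the community are also welcome. Thanks!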
Candidate materials
Remember to check which Julia version the book uses; anything compatible with LTS v1.0.5+ is fine.
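I think you should choose what to translate based on your own internal needs.
For example, if you plan to offer a linear algebra course, you could consider translating Introduction to Applied Linear Algebra – Vectors, Matrices, and Least Squares, so that course preparation and translation can support each other.
Of course, this is only my personal view: simply translating existing content can easily become a formality.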
How about Algorithms for Optimization?
Book template: GitHub - sisl/tufte_algorithms_book: A template for textbooks in the same style as Algorithms for Optimization
Youtube video introduction: https://www.youtube.com/watch?v=ofWy5kaZU3g&list=PLlHZu1B49BRZ5n7mw8x17HTqJQ7ar7kpG&index=6&t=0s
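My understanding is that you want something oriented toward Statistics and Data Science.
The linear algebra textbook mentioned above feels too basic to me.
If you could be more specific, it would be easier for people to make recommendations. For example, should the book cover the basics of linear algebra and mathematical statistics? Do you want a quick-start tutorial for Julia? Or would you prefer higher-level applications?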
Books
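- [PDF] Statistics with Julia: Fundamentals for Data Science, Machine Learning and Artificial Intelligence
Draft, not yet published; expected to be published in 2019.
This one leans toward statistical fundamentals. It includes a basic introduction to Julia and covers basic probability and statistics, probability distributions, data visualization, statistical inference, confidence intervals, hypothesis testing, and linear regression; the last two chapters also cover basic machine learning and dynamic probabilistic models.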
by Hayden Klok and Yoni Nazarathy. (DRAFT. PDF will be taken down when the book is published later in 2019).
Contents
- 1 Introducing Julia
- 1.1 Language Overview
- 1.2 Setup and Interface
- 1.3 Crash Course by Example
- 1.4 Plots, Images and Graphics
- 1.5 Random Numbers and Monte Carlo
- 1.6 Integration with Other Languages
- 2 Basic Probability
- 2.1 Random Experiments
- 2.2 Working With Sets
- 2.3 Independence
- 2.4 Conditional Probability
- 2.5 Bayes’ Rule
- 3 Probability Distributions
- 3.1 Random Variables
- 3.2 Moment Based Descriptors
- 3.3 Functions Describing Distributions
- 3.4 The Distributions and Related Packages
- 3.5 Families of Discrete Distributions
- 3.6 Families of Continuous Distributions
- 3.7 Joint Distributions and Covariance
- 4 Processing and Summarizing Data
- 4.1 Data Frames and Cleaning Data
- 4.2 Summarizing Data
- 4.3 Plots for Single Samples and Time Series
- 4.4 Plots for Multiple Samples
- 4.5 Plots for Multivariate and High Dimensional Data
- 4.6 Plots for the Board Room
- 4.7 Working with Files and Remote Servers
- 5 Statistical Inference Concepts
- 5.1 A Random Sample
- 5.2 Sampling from a Normal Population
- 5.3 The Central Limit Theorem
- 5.4 Point Estimation
- 5.5 Confidence Interval as a Concept
- 5.6 Hypothesis Tests Concepts
- 5.7 A Taste of Bayesian Statistics
- 6 Confidence Intervals
- 6.1 Single Sample Confidence Intervals for the Mean
- 6.2 Two Sample Confidence Intervals for the Difference in Means
- 6.3 Bootstrap Confidence Intervals
- 6.4 Confidence Interval for the Variance of Normal Population
- 6.5 Prediction Intervals
- 6.6 Credible Intervals
- 7 Hypothesis Testing
- 7.1 Single Sample Hypothesis Tests for the Mean
- 7.2 Two Sample Hypothesis Tests for Comparing Means
- 7.3 Analysis of Variance (ANOVA)
- 7.4 Independence and Goodness of Fit
- 7.5 Power Curves
- 8 Linear Regression and Extensions
- 8.1 Clouds of Points and Least Squares
- 8.2 Linear Regression with One Variable
- 8.3 Multiple Linear Regression
- 8.4 Model Adaptations
- 8.5 Model Selection
- 8.6 Logistic Regression and the Generalized Linear Model
- 8.7 Time Series and Forecasting
- 9 Machine Learning Basics
- 9.1 Training, Validation and Testing
- 9.2 Bias, Variance and Regularization
- 9.3 Supervised Learning Methods
- 9.4 Unsupervised Learning Methods
- 9.5 Reinforcement Learning and MDP
- 9.6 A Taste of Generative Adversarial Networks
- 10 Simulation of Dynamic Models
- 10.1 Deterministic Dynamical Systems
- 10.2 Markov Chains
- 10.3 Discrete Event Simulation
- 10.4 Models with Additive Noise
- 10.5 Network Reliability
- 10.6 Common Random Numbers and Multiple RNGs
- Appendix A How-to in Julia
- A.1 Basics
- A.2 Text and I/O
- A.3 Data Structures
- A.4 Data Frames
- A.5 Mathematics
- A.6 Randomness, Statistics and Machine Learning
- A.7 Graphics
- Appendix B Additional Julia Features
- Appendix C Additional Packages
- Bibliography
- List of code listings
- Index
- Data Science with Julia - CRC Press Book
January 11, 2019 - 220 pages
This book introduces Julia and covers data preprocessing, visualization, supervised and unsupervised learning, and interoperability with R.
Table of Contents
- Chapter 1 Introduction
- DATA SCIENCE
- BIG DATA
- JULIA
- JULIA PACKAGES
- R PACKAGES
- DATASETS
- Overview
- Beer Data
- Coffee Data
- Leptograpsus Crabs Data
- Food Preferences Data
- x Data
- Iris Data
- OUTLINE OF THE CONTENTS OF THIS MONOGRAPH
- Chapter 2 Core Julia
- VARIABLE NAMES
- TYPES
- Numeric
- Floats
- Strings
- Tuples
- DATA STRUCTURES
- Arrays
- Dictionaries
- CONTROL FLOW
- Compound Expressions
- Conditional Evaluation
- Loops
- Basics
- Loop termination
- Exception Handling
- FUNCTIONS
- Chapter 3 Working With Data
- DATAFRAMES
- CATEGORICAL DATA
- IO
- USEFUL DATAFRAME FUNCTIONS
- SPLIT-APPLY-COMBINE STRATEGY
- QUERY.JL
- Chapter 4 Visualizing Data
- GADFLY.JL
- VISUALIZING UNIVARIATE DATA
- DISTRIBUTIONS
- VISUALIZING BIVARIATE DATA
- ERROR BARS
- FACETS
- SAVING PLOTS
- Chapter 5 Supervised Learning
- INTRODUCTION
- CROSS-VALIDATION
- Overview
- K-Fold Cross-Validation
- K-NEAREST NEIGHBOURS CLASSIFICATION
- CLASSIFICATION AND REGRESSION TREES
- Overview
- Classification Trees
- Regression Trees
- Comments
- BOOTSTRAP
- RANDOM FORESTS
- GRADIENT BOOSTING
- Overview
- Beer Data
- Food Data
- COMMENTS
- Chapter 6 Unsupervised Learning
- INTRODUCTION
- PRINCIPAL COMPONENTS ANALYSIS
- PROBABILISTIC PRINCIPAL COMPONENTS ANALYSIS
- EM ALGORITHM FOR PPCA
- Background: EM Algorithm
- E-step
- M-step
- Woodbury Identity
- Initialization
- Stopping Rule
- Implementing the EM Algorithm for PPCA
- Comments
- K-MEANS CLUSTERING
- MIXTURE OF PPCAS
- Model
- Parameter Estimation
- Illustrative Example: Coffee Data
- Chapter 7 R Interoperability
- ACCESSING R DATASETS
- INTERACTING WITH R
- EXAMPLE: CLUSTERING AND DATA REDUCTION FOR THE COFFEE DATA
- Coffee Data
- PGMM Analysis
- VSCC Analysis
- EXAMPLE: FOOD DATA
- Overview
- Random Forests
- [Julia v0.4] Julia for Data Science
The content of this one is relevant, but it uses version 0.4, which is quite old. For reference only.
1 The Groundwork – Julia’s Environment
2 Data Munging
3 Data Exploration
4 Deep Dive into Inferential Statistics
5 Making Sense of Data Using Visualization
6 Supervised Machine Learning
7 Unsupervised Machine Learning
8 Creating Ensemble Models
9 Time Series
10 Collaborative Filtering and Recommendation System
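11 Introduction to Deep Learning
- vmls-julia-companion.pdf
This is the companion booklet to the linear algebra textbook mentioned above, and it contains a great many examples. If you have teaching in mind, I think this format is worth considering.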
course
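- The MIT courses: see whether any of them match your needs.
- Most of the other courses lean toward numerical analysis and optimization; very few cover statistics.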
- STAT 590F, Topics in Statistical Computing: Julia Seminar (Prof. Heike Hofmann), Fall 2014
- Northeastern University, Fall 2016: MTH3300: Applied Probability & Statistics
- 223490-0286, Statistical Learning Methods (Bogumił Kamiński): Fall 2017, Spring 2018, Fall 2018
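Got it, thanks for the reminder!
Understood, thanks for the suggestion! It is indeed a teaching need for now, and it should be at the level of an introduction to data mining.
The translation is meant for teaching. I will discuss with my advisor which option would be most productive. Thanks for the reminder!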
Thanks! I think it's an excellent book with Julia tutorials, but I'm not familiar with its contents. As I understand it, data mining mainly involves EDA (Exploratory Data Analysis) techniques. To ensure the quality of the translation, we would consider this book later. Thanks!
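Thanks for the suggestions! For now we probably only need material on data mining. The first book you recommended seems like a very good fit.
My advisor has not yet raised any need for more foundational material or for numerical optimization.
Integrating introductory Julia material into the course is probably necessary. The examples in the booklet should make good supplementary exercises.
Thank you very much!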