This course provides instruction on the processes and practice of data science, including machine learning and natural language processing. Included are tools and programming languages (Python, IPython, Mahout, Pig, NumPy, pandas, SciPy, scikit-learn), the Natural Language Toolkit (NLTK), and Spark MLlib.
Audience: Architects, software developers, analysts and data scientists who need to apply data science and machine learning on Hadoop.
Event Number: SCI-221
Available Languages: English (US), English (UK), French (Canada), German (Germany), Russian (Russia), Japanese (Japan), Chinese (Simplified), Italian (Italy), Portuguese (Brazil), French (France), Spanish (Latin America), Spanish (Spain), Portuguese (Portugal), Thai (Thailand), Dutch (The Netherlands)
Students must have experience with at least one programming or scripting language, knowledge of statistics and/or mathematics, and a basic understanding of big data and Hadoop principles. Students new to Hadoop are encouraged to attend the HDP Overview: Apache Hadoop Essentials course.