The O’Reilly Data Show Podcast: David Ferrucci on the evolution of AI systems for language understanding.
Ben Lorica is the Chief Data Scientist at O'Reilly Media, Inc. and is the Program Director of both the Strata Data Conference and the O'Reilly Artificial Intelligence Conference. He has applied Business Intelligence, Data Mining, Machine Learning and Statistical Analysis in a variety of settings including Direct Marketing, Consumer and Market Research, Targeted Advertising, Text Mining, and Financial Engineering. His background includes stints with an investment management company, internet startups, and financial services.
The O’Reilly Data Show Podcast: Lukas Biewald on why companies are spending millions of dollars on labeled data sets.
The O’Reilly Data Show Podcast: Reza Zadeh on deep learning, hardware/software interfaces, and why computer vision is so exciting.
The O’Reilly Data Show Podcast: Karthik Ramasamy on Heron, DistributedLog, and designing real-time applications.
The O’Reilly Data Show Podcast: Aurélien Géron on enabling companies to use machine learning in real-world products.
The O’Reilly Data Show Podcast: Francisco Webber on building HTM-based enterprise applications.
The O’Reilly Data Show Podcast: Max Ogden on data preservation, distributed trust, and bringing cutting-edge technology to journalism.
The O’Reilly Data Show Podcast: Anima Anandkumar on MXNet, tensor computations and deep learning, and techniques for scaling algorithms.
The O’Reilly Data Show Podcast: Parvez Ahammad on minimal supervision, and the importance of explainability, interpretability, and security.
The O’Reilly Data Show Podcast: Jason Dai on BigDL, a library for deep learning on existing data frameworks.
The O’Reilly Data Show Podcast: Adam Gibson on the importance of ROI, integration, and the JVM.
Putting deep learning into practice with new tools, frameworks, and future developments.
The O’Reilly Data Show Podcast: Greg Diamos on building computer systems for deep learning and AI.
From tools, to research, to ethics, Ben Lorica looks at what’s in store for artificial intelligence in 2017.
From deep learning to decoupling, here are the data trends to watch in the year ahead.
The O’Reilly Data Show Podcast: A look at some trends we’re watching in 2017.
The O’Reilly Data Show Podcast: Ion Stoica on building intelligent and secure applications on live data.
The O’Reilly Data Show Podcast: Vikash Mansinghka on recent developments in probabilistic programming.
The O’Reilly Data Show Podcast: Michael Franklin on the lasting legacy of AMPLab.
The O’Reilly Data Show Podcast: Dafna Shahaf on information cartography and AI, and Sam Wang on probabilistic methods for forecasting political elections.
The O’Reilly Data Show Podcast: Christopher Nguyen on the early days of Apache Spark, deep learning for time-series and transactional data, innovation in China, and AI.
The O’Reilly Data Show Podcast: Natalino Busa on developments in feature engineering and predictive techniques across industries.
The O’Reilly Data Show Podcast: Shaoshan Liu on perception, knowledge, reasoning, and planning for autonomous cars.
The O’Reilly Data Show Podcast: Dean Wampler on streaming data applications, Scala and Spark, and cloud computing.
The O’Reilly Data Show Podcast: Michael Li on the state of data engineering and data science training programs.
The O’Reilly Data Show Podcast: Rana el Kaliouby on deep learning, emotion detection, and user engagement in an attention economy.
The O’Reilly Data Show Podcast: Adam Marcus on intelligent systems and human-in-the-loop computing.
The O’Reilly Data Show Podcast: Jana Eggers on building applications that rely on synaptic intelligence.
The O’Reilly Data Show Podcast: John Akred on building data platforms and enterprise data strategies.
Techniques to address overfitting, hyperparameter tuning, and model interpretability.
Doug Cutting, Tom White, and Ben Lorica explore Hadoop's role over the coming decade.
The O’Reilly Data Show Podcast: Yishay Carmiel on applications of deep learning in text and speech.
The O’Reilly Data Show Podcast: Rajat Monga on the current state of TensorFlow and training large-scale deep neural networks.
Mike Loukides and Ben Lorica examine factors that have made AI a hot topic in recent years, today's successful AI systems, and where AI may be headed.
The O’Reilly Data Show Podcast: Rohit Jain on the challenges of hybrid data management systems.
The O’Reilly Data Show Podcast: Mike Tung on large-scale structured data extraction, intelligent systems, and the importance of knowledge databases.
The O’Reilly Data Show Podcast: Michael Armbrust on enabling users to perform streaming analytics, without having to reason about streaming.
The O’Reilly Data Show Podcast: Danny Bickson on recommenders, data science, and applications of machine learning.
The O’Reilly Data Show Podcast: Ira Cohen on developing machine learning tools for a broad range of real-time applications.
The O’Reilly Data Show Podcast: Mikio Braun on practical data science, deep neural networks, machine learning, and AI.
Apache Hadoop co-founders Doug Cutting and Mike Cafarella explore the future of Hadoop.
The O’Reilly Data Show Podcast: Duncan Ross on the evolution of analytics, data mining, and data philanthropy.
The O’Reilly Data Show Podcast: M.C. Srivas on streaming, enterprise grade systems, the Internet of Things, and data for social good.
The O’Reilly Data Show Podcast: Fang Yu on data science in security, unsupervised learning, and Apache Spark.
The O’Reilly Data Show Podcast: Joe Hellerstein on data wrangling, distributed systems, and metadata services.
The O’Reilly Data Show Podcast: Eric Colson on algorithms, human computation, and building data science teams.
Emerging trends in intelligent mobile applications and distributed computing.
The O’Reilly Data Show Podcast: Vasant Dhar on the race to build “big data machines” in financial investing.
Promising topics in data that we'll be watching closely in the year ahead.
The O’Reilly Data Show Podcast: A fireside chat with Ben Horowitz, plus Reynold Xin on the rise of Apache Spark in China.
The O’Reilly Data Show Podcast: Evan Chan on the early days of Spark+Cassandra, FiloDB, and cloud computing.
The O’Reilly Data Show Podcast: Emil Eifrem on popular applications of graph technologies, cloud computing, and company culture.
The O’Reilly Data Show Podcast: The Hadoop ecosystem, the recent surge in interest in all things real time, and developments in hardware.
The O’Reilly Data Show Podcast: Tyler Akidau on the evolution of systems for bounded and unbounded data processing.
Consolidating data across silos improves business insight.
The O’Reilly Data Show Podcast: Evangelos Simoudis on data mining, investing in data startups, and corporate innovation.
Smart cities and smart nations run on data.
Comprehensive metadata collection and analysis can pave the way for many interesting applications.
The O’Reilly Data Show Podcast: Todd Lipcon on hybrid and specialized tools in distributed systems.
A new crop of interesting solutions for the complexity of operating multiple systems in a distributed computing setting.
The O’Reilly Data Show Podcast: Dean Wampler on bounded and unbounded data processing and analytics.
The O'Reilly Data Show Podcast: Mike Cafarella on the early days of Hadoop/HBase and progress in structured data extraction.
Tools and learning resources for building intelligent, real-time products.
Logical and well-crafted collections of data video courses get you where you need to go.
The O'Reilly Data Show Podcast: Ben Recht on optimization, compressed sensing, and large-scale machine learning pipelines.
The O'Reilly Data Show Podcast: Ihab Ilyas on building data wrangling and data enrichment tools in academia and industry.
The O'Reilly Data Show Podcast: Phil Liu on the evolution of metric monitoring tools and cloud computing.
The O'Reilly Data Show Podcast: Patrick Wendell on the state of the Spark ecosystem.
A new partnership between O’Reilly and DataStax offers certification and training in Cassandra.
The O'Reilly Data Show Podcast: Gary Kazantsev on how big data and data science are making a difference in finance.
The O'Reilly Data Show Podcast: Anima Anandkumar on tensor decomposition techniques for machine learning.
A survey of the landscape shows the types of tools remain the same, but interfaces continue to improve.
The O'Reilly Data Show Podcast: Mikio Braun on stream processing, academic research, and training.
Things are moving fast in the stream processing world.
Tensor methods for machine learning are fast, accurate, and scalable, but we'll need well-developed libraries.
Angie Ma's startup, London-based ASI, runs a carefully structured “finishing school” for science and engineering doctorates.
David Blei, co-creator of one of the most popular tools in text mining and machine learning, discusses the origins and applications of topic models.
Understanding information cascades, viral content, and significant relationships.
The O'Reilly Data Show Podcast: Carlos Guestrin on the early days of GraphLab and the evolution of GraphLab Create.
We need primitives, pipeline synthesis tools, and most importantly, error analysis and verification.
In this O'Reilly Data Show Podcast, DJ Patil weighs in on a wide range of topics in data science and big data.
Drawing inspiration from recent advances in data preparation.
In this O'Reilly Data Show Podcast, Ion Stoica talks about the rise of Apache Spark and Apache Mesos.
In this episode of the O'Reilly Data Show Podcast, Jay Kreps talks about data integration, event data, and the Internet of Things.
Rajiv Maheswaran talks about the tools and techniques required to analyze new kinds of sports data.
From cognitive augmentation to artificial intelligence, here's a look at the major forces shaping the data world.
A new partnership between O’Reilly and Databricks offers certification and training in Apache Spark.
Learn simple ways to improve data models by cleaning up and tweaking the distribution of training data.
New frameworks for interactive business analysis and advanced analytics fuel the rise in tabular data objects.
Business users are becoming more comfortable with graph analytics.
Researchers and startups are building tools that enable feature discovery.
Many more companies want to highlight how they're using Apache Spark in production.
Casting a critical eye on the exciting developments in the world of AI.
An array of tools for tackling data visualizations.
It has roots in academic scientific computing, but has features that appeal to many data scientists.
It's an extensive, well-documented, accessible, and curated library of machine-learning models.
Python and Scala are popular among members of several well-attended SF Bay Area Meetups.
We are in the early days of productivity technology in data science.
The inaugural Spark Summit will feature a wide variety of real-world applications.
A general purpose stream processing framework from the team behind Kafka and new techniques for computing approximate quantiles.
A distributed, near real-time system simplifies the collection, storage, and mining of massive amounts of event data.
Specialized tools run the risk of being replaced by others that have more coverage.
Tools simplify the application of advanced analytics and the interpretation of results.
As data sizes continue to grow, interactive query systems may start adopting the sampling approach central to BlinkDB.
Compelling large-scale data platforms originate from the world of IT Operations.
A new crop of data science tools for deploying, monitoring, and maintaining models.
Graph data is an area that has attracted many enthusiastic entrepreneurs and developers.
Visual analysis tools are adding advanced analytics for big data.
Tachyon enables data sharing across frameworks and performs operations at memory speed.
Researchers begin to scale up pattern recognition, machine-learning, and data management tools.
A variety of tools are making data science tasks easy to do in Python.
Shark is 100X faster than Hive for SQL, and 100X faster than Hadoop for machine learning.
Spark is becoming a key part of a big data toolkit.