One of our goals is to bring Jupyter’s enterprise use cases and practices into one place.
The O’Reilly Data Show Podcast: A special installment to mark the 100th episode.
May 25 is an important day for data protection in the EU and elsewhere. Alison Howard explains how Microsoft has prepared for May 25 and beyond.
Pierre Romera explores the challenges in making 1.4 TB of data securely available to journalists all over the world.
Mick Hollison, Sven Löffler, and Robert Neumann explain how Deutsche Telekom is harnessing machine learning and analytics in the cloud to build Europe’s largest IoT data marketplace.
Watch highlights covering machine learning, GDPR, data protection, and more. From the Strata Data Conference in London 2018.
Jean-François Puget explains why human context should be embraced as a guide to building better and smarter systems.
Eva Kaili outlines the fundamentals of GDPR and applications of blockchain.
Answers to the three most commonly asked questions about maintaining GDPR-compliant machine learning programs.
The O’Reilly Data Show Podcast: Jason Dai on the first year of BigDL and AI in China.
Privacy-preserving analytics is not only possible; with GDPR about to come online, incorporating privacy into your data products will become necessary.
The O’Reilly Data Show Podcast: Jerry Overton on organizing data teams, agile experimentation, and the importance of ethics in data science.
Both reproducible science and open source are necessary for collaboration at scale—the nexus for that intermingling is Jupyter.
Learn how Spark 2.3.0+ integrates with Kubernetes (K8s) clusters on Google Cloud and Azure.
A post-mortem of a failed analytics startup.
Discover how data-driven organizations are using Jupyter to analyze data, share insights, and foster practices for dynamic, reproducible data science.
The O’Reilly Data Show Podcast: Guillaume Chaslot on bias and extremism in content recommendations.
The two positions are not interchangeable—and misperceptions of their roles can hurt teams and compromise productivity.
Strata Data London will introduce technologies and techniques; showcase use cases; and highlight the importance of ethics, privacy, and security.
In an era where fake news travels faster than the truth, our communities are at a critical juncture.
The O’Reilly Data Show Podcast: Jesse Anderson and Paco Nathan on organizing data teams and next-generation messaging with Apache Pulsar.
A deep dive into model interpretation as a theoretical concept and a high-level overview of Skater.
Comcast’s system of storing schemas and metadata enables data scientists to find, understand, and join data of interest.
The O’Reilly Data Show Podcast: Ameet Talwalkar on large-scale machine learning.
Eric Colson explains why companies must now think very differently about the role and placement of data science in organizations.
Seth Stephens-Davidowitz explains how to use Google searches to uncover behaviors or attitudes that may be hidden from traditional surveys.
Ajey Gore explains why GO-JEK is focusing its attention beyond urban Indonesia to help people across the country’s rural areas.
William Vambenepe walks through an interesting use case of machine learning in action and discusses the central role AI will play in big data analysis moving forward.
Anoop Dawar shares principles successful companies are using to inspire an insight-driven ethos and build data-competent organizations.
Using silly data sets as examples, Janelle Shane talks about ways that algorithms fail.
How to find promising candidates for upskilling within your organization.
Tobias Ternstrom explains why you should objectively evaluate the problem you're trying to solve before choosing the tool to fix it.
Ben Lorica explores emerging security best practices for business intelligence, machine learning, and mobile computing products.
Alex Smola shares lessons learned from AWS SageMaker, an integrated framework for handling all stages of analysis.
Watch highlights covering machine learning, business intelligence, data privacy, and more. From the Strata Data Conference in San Jose 2018.
Li Fan shows how Pinterest is using AI to predict what’s in an image, what a user wants, and what they’ll want next.
Natalie Evans Harris discusses the Community Principles on Ethical Data Practices (CPEDP), a code of ethics for data collection, sharing, and utilization.
Dinesh Nirmal explains how real-world machine learning reveals assumptions embedded in business processes that cause expensive misunderstandings.
Nancy Lublin and Bob Filbin explore findings from crisis data.
The O’Reilly Data Show Podcast: Ofer Ronen on the current state of chatbots.
A product manager's guide to employing data as a feature.
The O’Reilly Data Show Podcast: Danny Lange on how reinforcement learning can accelerate software development and how it can be democratized.
A comparison of the accuracy and performance of Spark-NLP vs. spaCy, and some use case recommendations.
A step-by-step guide to building and running a natural language processing pipeline.
A step-by-step guide to initializing the libraries, loading the data, and training a tokenizer model using Spark-NLP and spaCy.
A look at the new streaming SQL engine for Apache Kafka.
Attend a day-long exploration of Jupyter's best practices and practical use cases in business and industry.
Ingest the data you need in an agile manner.
The O’Reilly Data Show Podcast: Leo Meyerovich on building large-scale, interactive applications that enable visual investigations.
Alysa Hutnik discusses the Fair Credit Reporting Act, the Equal Credit Opportunity Act, the Gramm-Leach-Bliley Act, and the FTC’s focus on FinTech.
How companies such as athenahealth can transform legacy data into insights.
Gain agility by loading first and transforming later.
A glimpse into what lies ahead for response automation, model compliance, and repeatable experiments.
The media and ad tech sessions at the Strata Data Conference in San Jose will dig deep into how media businesses are changing.
The O’Reilly Data Show Podcast: Mark Hammond on applications of reinforcement learning to manufacturing and industrial automation.
Regardless of country or culture, any solid data science plan needs to address veracity, storage, analysis, and use.
By packaging and delivering actionable data in applications, product managers can help users achieve their goals.
The ability to appeal may be the most important part of a fair system, and it's one that isn't often discussed in data circles.
Designing application architectures for real-time decisions.
Facilitating data exchange across the enterprise.