Optimizing Apache Spark & Tuning Best Practices

Delivery: On-site, online

Xebia Academy
Rating: 8.6 (average rating of Xebia Academy's courses, based on 104 reviews)

Start dates and locations

Online (virtual), 27 to 28 June 2024
  • Day 1: 27 June 2024, 09:00-17:00
  • Day 2: 28 June 2024, 09:00-17:00

Online (virtual), 22 to 23 August 2024
  • Day 1: 22 August 2024, 09:00-17:00
  • Day 2: 23 August 2024, 09:00-17:00

Wibautstraat 200, Amsterdam, 7 to 8 October 2024
  • Day 1: 7 October 2024, 09:00-17:00
  • Day 2: 8 October 2024, 09:00-17:00

Wibautstraat 200, Amsterdam, 25 to 26 November 2024
  • Day 1: 25 November 2024, 09:00-17:00
  • Day 2: 26 November 2024, 09:00-17:00

Description

This live-virtual course is perfect for

Data and Machine Learning Engineers who transform large volumes of data and need production-quality code. Expert Data Scientists can also participate: they will learn how to get the most performance out of Spark and how simple tweaks can improve performance dramatically.

What will you learn during Optimizing Apache Spark & Tuning Best Practices?

After this training, you will understand how Apache Spark works internally, know the best practices for writing performant code, and have the essential skills to debug and tune your Spark applications.

Program

Fundamentals

  • Spark execution model: Driver/Executors
  • Spark resource managers (YARN, Mesos, Kubernetes)
  • Understanding the RDD and DataFrame APIs and their language bindings
  • Difference between actions and transformations
  • How to read the query plan (logical and physical)
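
The distinction between transformations and actions listed above can be pictured with a plain-Python analogy. This is only a sketch and not Spark code: `LazyDataset` and its methods are hypothetical names invented for illustration. Transformations merely record a step in a plan, and nothing runs until an action such as `collect` is called.

```python
class LazyDataset:
    """Toy stand-in for a lazily evaluated dataset (hypothetical, not a Spark API)."""

    def __init__(self, rows, steps=()):
        self._rows = rows
        self._steps = tuple(steps)  # recorded (name, function) pairs

    # Transformations: return a new dataset with one more recorded step.
    def map(self, fn):
        return LazyDataset(self._rows, self._steps + (("map", fn),))

    def filter(self, pred):
        return LazyDataset(self._rows, self._steps + (("filter", pred),))

    def explain(self):
        """Render the recorded plan without executing it."""
        return " -> ".join(["scan"] + [name for name, _ in self._steps])

    # Action: only here are the recorded steps actually executed.
    def collect(self):
        out = []
        for row in self._rows:
            keep = True
            for name, fn in self._steps:
                if name == "map":
                    row = fn(row)
                elif not fn(row):
                    keep = False
                    break
            if keep:
                out.append(row)
        return out


calls = []

def double(x):
    calls.append(x)  # lets us observe when the function really runs
    return x * 2

ds = LazyDataset(range(5)).filter(lambda x: x % 2 == 0).map(double)
calls_before_action = len(calls)  # still zero: nothing has executed yet
plan = ds.explain()
result = ds.collect()             # the action triggers execution
```

In this toy model, `explain` plays the role of inspecting a plan before running it; a real Spark query plan carries far more detail than a list of step names.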

Spark internals

  • Spark Memory model
  • Understanding persistence (caching)
  • Catalyst optimizer and Tungsten project
  • Shuffle service and how the shuffle operation is executed
  • Concept of fair scheduling and pools
  • Java and Kryo serializers
  • A step into the JVM world: what you need to know about garbage collection (GC) when running Spark applications

Spark optimization: main problems and issues

  • The most common memory problems
  • Benefits of early filtering
  • Understanding partition and predicate filtering
  • Join optimization
  • Combating data skew (preprocessing, broadcasting, salting)
  • Understanding shuffle partitions: how to tackle memory/disk spill
  • Downsides of using UDFs
  • Executor idle timeout
  • Data formats examples
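
Of the skew remedies named above, salting is perhaps the least intuitive, so here is a minimal pure-Python simulation of the idea. It is a sketch under stated assumptions, not Spark code: the partition count, the salt range, the key names, and the CRC32-based `partition_of` function are all hypothetical stand-ins for a real hash partitioner.

```python
import random
import zlib
from collections import Counter

NUM_PARTITIONS = 8  # hypothetical number of shuffle partitions
NUM_SALTS = 8       # hypothetical salt range

def partition_of(key, num_partitions=NUM_PARTITIONS):
    """Deterministic stand-in for a hash partitioner."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

def partition_sizes(keys):
    """Count how many records land in each partition."""
    return Counter(partition_of(k) for k in keys)

def salt(key, rng, num_salts=NUM_SALTS):
    """Append a random suffix so one logical key becomes several physical keys."""
    return f"{key}#{rng.randrange(num_salts)}"

# Skewed input: a single hot key accounts for 90% of the records.
keys = ["hot"] * 9000 + [f"k{i}" for i in range(1000)]

rng = random.Random(42)  # seeded so the simulation is repeatable
salted = [salt(k, rng) for k in keys]

plain_max = max(partition_sizes(keys).values())     # size of the largest partition
salted_max = max(partition_sizes(salted).values())  # largest partition after salting
```

Without salting, every record of the hot key lands in one partition, so a single task does most of the work. In a real salted join, the other side has to be replicated once per salt value so that every salted key still finds its match; that replication is the price of the technique.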

Moving to production

  • Debugging / troubleshooting
  • Productionizing your Spark application
  • Dynamic allocation and dynamic partitioning
  • Profiling your Spark application (Sparklint)
  • JVM profiler
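
Several topics in the program above correspond to Spark configuration properties. The sketch below collects a few of them and renders them as `spark-submit` arguments. The property names are real Spark settings, but every value, the helper `spark_submit_args`, and the application file `my_job.py` are illustrative assumptions rather than recommendations from the course.

```python
# Property names are Spark settings; all values are placeholder assumptions.
SPARK_CONF = {
    # Kryo instead of the default Java serializer
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    # Number of partitions used when shuffling data for joins and aggregations
    "spark.sql.shuffle.partitions": "400",
    # Fair scheduling between jobs inside one application
    "spark.scheduler.mode": "FAIR",
    # Dynamic allocation: scale the number of executors with the workload
    "spark.dynamicAllocation.enabled": "true",
    "spark.dynamicAllocation.minExecutors": "2",
    "spark.dynamicAllocation.maxExecutors": "20",
    # Release executors that have been idle for this long
    "spark.dynamicAllocation.executorIdleTimeout": "60s",
    # External shuffle service, commonly enabled alongside dynamic allocation
    "spark.shuffle.service.enabled": "true",
}

def spark_submit_args(conf, app):
    """Build a spark-submit argument list from a dict of properties (hypothetical helper)."""
    args = ["spark-submit"]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args

command = spark_submit_args(SPARK_CONF, "my_job.py")  # my_job.py is a placeholder
print(" ".join(command))
```

Keeping the settings in one dictionary makes it easy to review them together and to vary a single value between test runs.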

Data Engineering Trainers

This Data Engineering training is brought to you by Xebia Data. Xebia Data is part of Xebia, just like Xebia Academy. Xebia Data works with experts in their field who are always on the lookout for the most innovative ways to get the most out of data. Your trainer is a data guru who enjoys sharing his or her experiences to help you work with the latest tools.

Yes, I want to know more about Apache Spark!

After registering for this training, you will receive a confirmation email with practical information. A week before the training we share literature if there's anything you need to prepare. See you soon!

Virtual or in-person training: This training can be delivered either in person or online. For in-person training, we provide lunch, snacks and drinks for the participants, so virtual training is offered at a discount.
