In this session, diverse groups and initiatives that foster diversity in tech have the opportunity to present themselves in a round of lightning talks. Afterwards, we will connect and network in a facilitated exchange of ideas and discussions. Technologists create products and services, and in doing so affect our everyday lives now and in the future. If these products and services are to serve our society as a whole, we need many diverse and creative perspectives reflecting the richness of our communities. We therefore strongly believe that striving to increase diversity among technologists should be a major goal. In our pursuit of diversity, one of the most encouraging things we can do is to highlight the women who already work on great things within their respective industries, provide them with platforms to share their knowledge, and let them inspire by example. In Vienna, a range of initiatives promoting gender diversity in STEM (science, technology, engineering & math) exists. To increase their visibility and to facilitate the exchange of ideas and getting to know each other among these initiatives, a session solely dedicated to these groups and projects will take place on Friday afternoon. As an attendee, you have the chance to easily get in touch with different groups and meet like-minded technologists. As a representative of an initiative, you have the chance to present your program and invite attendees to your events. People of all genders are welcome to attend our event. For more details and the groups already enlisted, see the Women in Tech page on the PyDays 2019 website.
jackie ,
#queer #feminist part-time #sysadmin, #developer and #trainer. Into: #linux #python #itsec #participatorydesign #transdisciplinarity #education ... IT machinist (aka sys & network admin) in diverse and changing contexts (e.g. ORANGE 94.0, a community radio station in Vienna). Also involved in organising Feminist Python Meetups and Feminist Linux Meetups, and a member of the emancipatory ICT collective diebin.at. Doing workshops and trainings, and from time to time a little academic teaching, e.g. the seminar "hacking gender, hacking technology". More info at tantemalkah.at
Diversity matters! Not just because it makes the world a fairer place, but also because it improves welfare and safety for society as a whole. It is becoming increasingly clear that diversity is also good for business, as it improves creativity and increases the company's bottom line. The co-mentoring program "Women in data" is a platform to connect and learn together. This is our first official meetup and you are very welcome to attend! The meetup is open to women* and non-binary folks who are interested in or working with data. Data professionals design and implement some of the most innovative products for society as a whole. Nonetheless, data teams tend to be homogeneous and mainly (if not entirely) composed of male members. This lack of diversity leads to a lack of different perspectives on product design, reducing profitability opportunities for businesses. Even worse, it can put human lives at risk, as explained in the article "The deadly truth about a world built for men – from stab vests to car crashes". So, the question is: how do we improve diversity in data teams? One of the most effective ways is to build a strong network and empower others. To achieve this, it is important to meet female data professionals, to get to know each other's work, and to establish mutually beneficial learning relationships. We are often asked to recommend fellow data professionals in different contexts, but since the community is dispersed, even we as women face difficulties coming up with a female colleague's name; only male colleagues' names come to mind. If the network grows strong, this will hopefully no longer be the case: we will know whom to recommend, because we know who we are and we know each other's work.
We plan to meet once per month and discuss two projects per session, helping each other with feedback and ideas.
Mari Plaza ,
Data becomes more useful when it is shared. In our talk we present our findings and future goals about transferring data in a privacy-respecting and traceable way. We will lay out the technical foundation and demonstrate use cases in a live-coding session by accessing Semantic Containers with Jupyter notebooks. Semantic Containers is a concept of bundling data, semantic description, and processing capabilities in a Docker container. This provides capabilities for validating access to data, automatically building provenance records, and ensuring data quality. The project is currently funded by FFG and we will present some of the already available use cases. This includes visualization of earthquake data provided by ZAMG (Zentralanstalt für Meteorologie und Geodynamik) and creating time-lapse videos of Sentinel-2 satellite data provided by the Copernicus programme of the EU. In this talk you will learn about the technical foundation of Semantic Containers and how to successfully integrate the concept into your daily routine as a Data Scientist, with a focus on Python.
Christoph Fabianek ,
DI Dr. Christoph Fabianek, MBA, studied Technical Mathematics at TU Wien, completed a part-time MBA at Danube University Krems, and is an (ACC-accredited) systemic coach. After three years of consulting at AI Informatics and Siemens, he moved to Frequentis AG, where he has worked for over ten years as a project and product manager in various areas. In addition, he is active in several startups and runs the platform OwnYourData.eu.
While the Python logging module makes it simple to add flexible logging to your application, wording log messages and choosing the appropriate level to maximize their helpfulness is a topic hardly covered in the documentation. This talk gives guidelines on when to choose a certain log level, what information to include, and which wording templates to use. The Python standard library includes a logging module that makes it simple to add flexible logging to your application. The technicalities of logging are well covered in the documentation and various blogs. However, it is less clear how to choose the log level depending on the situation and how to word log messages to maximize their helpfulness. In this talk, guidelines and examples are given to answer the following questions for each log level (DEBUG, INFO, WARNING, ERROR and CRITICAL): When to log at a certain level? Which information to include in messages on that level? What are common templates for messages on that level?
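One common convention for choosing levels (a sketch of the kind of guidelines the talk covers, not the talk's actual rules) can be illustrated with a small example:

```python
import logging

# One possible level convention (an illustration, not the only valid one):
#   DEBUG    - details useful only when diagnosing a problem
#   INFO     - confirmation that things work as expected
#   WARNING  - something unexpected, but the application keeps working
#   ERROR    - an operation failed, but the application can continue
#   CRITICAL - the application itself cannot continue
logger = logging.getLogger("payments")

def charge(amount, retries_left=2):
    logger.debug("charge() called with amount=%s retries_left=%s", amount, retries_left)
    if amount <= 0:
        # ERROR: this request failed, but the service keeps running.
        logger.error("Cannot charge non-positive amount %s; rejecting request", amount)
        return False
    if retries_left < 2:
        # WARNING: degraded but still functional.
        logger.warning("Charging %s after %s failed attempt(s)", amount, 2 - retries_left)
    logger.info("Charged %s successfully", amount)
    return True
```

Note the messages name the failing value and the action taken, so a reader of the logs does not have to look at the code.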
Thomas Aglassinger ,
Thomas Aglassinger is a software engineer and open source software enthusiast and co-host of the PyGRAZ Python user group in Graz. His professional experience includes developing medical workflow applications and business intelligence in finance. He holds a master's degree in information processing science.
In this contribution we tell the story of a document data extraction algorithm. We focus on the challenges that had to be overcome during scale-up and how "internationalization" was built in by design when the algorithm was conceived. In particular, we discuss: 1) how the data science and back-end development team was organized, and how the roles of the team members changed during the lifecycle of the algorithm; 2) challenges we encountered during go-live and lessons learnt; 3) patents and IP protection.
Captivating a group of children for a sustained period of time is notoriously hard. I will be exploring with you how the micro:bit can be used to engage a young audience with interactive demos and programming activities. Children love to physically interact with hardware; that is what is great about the micro:bit. It not only has built-in LEDs and buttons but can also control bigger, more exciting electronics like robotic arms. This allows for a setup that challenges students of all ability levels. Students who are new to programming can be given a more complete program, or can work through a worksheet that takes them through the process of controlling the device (in this case a robotic arm) step by step. Students with stronger programming skills can be given a more bare-bones boilerplate source file and can write code to control the arm with less guidance.
This is the story of how we solved our performance and reliability issues while giving our users' workflow a speed boost and saving their sanity by generating code automatically from human-readable specifications. Internally, we provide a configurable stream data processing tool to normalize log lines from multiple sources into a common structure. We developed a DSL and a DSL-to-Python compiler to let users express their needs without requiring coding skills, just their domain knowledge. At RadarServices, we deal with real-time log processing from disparate sources and various customers, and we need to process the logs in a normalized form for analysis. Due to the differences in data formats this can be quite a challenge. Originally, we used handwritten normalizers, which, of course, was time-consuming and inefficient. Making use of our knowledge, we implemented from scratch a system that translates our custom, easily understandable DSL to Python bytecode, which allowed our analysts to greatly increase their productivity. To do this, we used the Python ast standard library, Ply (lex/yacc), and our knowledge of compilers.
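The general compile-a-spec-to-Python pattern can be sketched in a few lines. This toy spec and its field names are invented for illustration (the actual RadarServices DSL and its Ply grammar are far richer), but the route through the ast module to a compiled code object is the same idea:

```python
import ast

# Toy "rename field" spec (hypothetical syntax, not the real DSL).
SPEC = """
src_ip -> source.ip
dst_ip -> destination.ip
"""

def compile_normalizer(spec):
    """Generate Python source from the spec and compile it to bytecode."""
    lines = ["def normalize(event):", "    out = {}"]
    for rule in spec.strip().splitlines():
        src, dst = (part.strip() for part in rule.split("->"))
        lines.append(f"    if {src!r} in event:")
        lines.append(f"        out[{dst!r}] = event[{src!r}]")
    lines.append("    return out")
    source = "\n".join(lines)
    tree = ast.parse(source)                      # parse to an AST
    code = compile(tree, filename="<dsl>", mode="exec")  # AST -> bytecode
    namespace = {}
    exec(code, namespace)
    return namespace["normalize"]

normalize = compile_normalizer(SPEC)
```

Once compiled, `normalize` runs at ordinary Python speed with no per-event interpretation of the spec.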
Dario Meloni ,
The asyncio library and the concept of coroutines provide a useful abstraction for Python programmers. However, the library can be hard to understand: it is not immediately obvious how to use it, under which circumstances you would pick it over threads or processes, and why. This workshop tries to fill that gap. The workshop starts with an introduction that explains the underlying concept of coroutines and gives a bit of behind-the-scenes insight into how they are implemented in Python. There will be a comparison with threads and processes and guidance on how to pick the right tool for the job. After the introduction the actual workshop begins: participants are handed a Python/Jupyter notebook containing exercises on coroutines and the asyncio library, which they can complete at their own pace. Participants should be comfortable writing Python code, but don't have to be familiar with concurrency or parallelism to participate. Prerequisites: you will need a machine with Python 3.7 installed.
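The core idea the workshop builds on can be shown in a few lines: coroutines let one thread interleave many waiting tasks. A minimal sketch (the workshop exercises themselves are more involved):

```python
import asyncio

async def fetch(name, delay):
    await asyncio.sleep(delay)   # stands in for real I/O (network, disk, ...)
    return f"{name} done"

async def main():
    # Both coroutines wait concurrently: total time is roughly the
    # maximum of the delays, not their sum.
    return await asyncio.gather(fetch("a", 0.1), fetch("b", 0.1))

results = asyncio.run(main())
```

Because the waiting happens inside a single thread, there is no locking to get wrong, which is one of the reasons to pick asyncio over threads for I/O-bound work.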
Martin Natano ,
I have been using Python for more than a decade now. I've seen the language evolve quite a bit, witnessed the move to Python 3, and had the opportunity to grow as a programmer during that time. While most of the paid-for work I've done so far was in high-level languages, in my spare time I prefer lower-level languages like C to work on my synthesizer projects, running on OpenBSD, and to contribute to open source. I have mostly finished my organ synthesizer modeled after the famous electro-mechanical Hammond B3 organ, which took a lot of delving into obscure corners of the internet dedicated to keeping the dream alive and, of course, the official Hammond service manual. I have contributed to Bitrig and I'm a contributor to OpenBSD. Sometimes I like to do ASCII art (quite amateurishly).
This study aims at developing rainy-cloud delineation schemes based on the high spectral resolution of the Meteosat Second Generation - Spinning Enhanced Visible and Infrared Imager (MSG-SEVIRI). The objective is to investigate whether ensemble classification techniques can improve rain area delineation, based on the correlation of spectral and textural cloud properties extracted from the thermal infrared MSG-SEVIRI data with rainfall data recorded by the meteorological stations of the National Observatory of Athens (NOA). Two different precipitating-cloud detection schemes are examined that use spectral cloud parameters to discriminate raining from non-raining clouds. The first is an Artificial Neural Network (multilayer perceptron, MLP) model, and the second is a Random Forest (RF) algorithm that relies on the correlation of spectral cloud parameters with rain information recorded by rain gauge stations. Both algorithms were developed in Python using the scikit-learn library.
The detection schemes were calibrated and evaluated using spatially and temporally collocated 15-minute observations from seven SEVIRI thermal infrared channels and rain gauge data for several rain events. The SEVIRI data were acquired at 15-minute intervals with a spatial resolution of 3 x 3 km2 at the sub-satellite point, reaching 4 x 5 km2 over the study area. Rain gauge data from 88 NOA stations were used as rainfall information to train the models, and the models were validated against independent rainy days that were not used for training. The RF model showed the best performance among all rain-cloud delineation models during the training phase, and it still provided the best performance when evaluated against the independent rain gauge dataset.
Apostolos Giannakos ,
Experienced researcher with more than ten years of undergraduate and postgraduate research experience in the fields of satellite meteorology, satellite remote sensing and GIS. Knowledgeable in the development of satellite products, with research experience in satellite data analysis techniques, GIS, and programming for meteorological and environmental applications.
Typo correction makes up a big part of the code review process, and it can and should be automated. I present Typos Corrector, a tool for automatic correction of typos in source code identifiers in pull requests. It is powered by AI and source{d} Lookout, an open source framework for assisted code review that allows creating and running custom source code analyzers. Typos Corrector encompasses the knowledge obtained from 60 million identifiers present in the world's open source code and adjusts to each repository to leverage local naming conventions. It achieves 93% correction accuracy on our dataset. The talk will focus on the Typos Corrector's architecture, the details of its pure Python implementation, and the problems I hit when applying ML to real-world problems.
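The core task, mapping a misspelled identifier onto the vocabulary actually used in a repository, can be sketched with the standard library. This is only a toy illustration of the problem setting; Typos Corrector itself uses an ML model trained on millions of identifiers, not string similarity:

```python
import difflib

# Hypothetical vocabulary, standing in for identifiers seen in a repository.
vocabulary = ["connection", "configuration", "container", "normalize"]

def suggest(identifier):
    """Return the closest known identifier, or the input if nothing is close."""
    matches = difflib.get_close_matches(identifier, vocabulary, n=1, cutoff=0.8)
    return matches[0] if matches else identifier
```

The high cutoff keeps the toy corrector conservative: a false correction in a pull request is worse than a missed one, a trade-off the real tool also has to make.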
OOP has been around forever. Yet, every day I see people writing spaghetti code with the occasional function included. Why do so many people discard OOP? How can you wrap your head around it and become a better coder?
Diversity matters! Not just because it makes the world a fairer place, but also because it improves welfare and safety for society as a whole. It is becoming increasingly clear that diversity is also good for business, as it improves creativity and increases the company's bottom line. The co-mentoring program "Women in data" is a platform to connect and learn together. This is our first official meetup and you are very welcome to attend! The meetup is open to women* and non-binary folks who are interested in or working with data. Data professionals design and implement some of the most innovative products for society as a whole. Nonetheless, data teams tend to be homogeneous and mainly (if not entirely) composed of male members. This lack of diversity leads to a lack of different perspectives on product design, reducing profitability opportunities for businesses. Even worse, it can put human lives at risk, as explained in the article "The deadly truth about a world built for men – from stab vests to car crashes". So, the question is: how do we improve diversity in data teams? One of the most effective ways is to build a strong network and empower others. To achieve this, it is important to meet female data professionals, to get to know each other's work, and to establish mutually beneficial learning relationships. We are often asked to recommend fellow data professionals in different contexts, but since the community is dispersed, even we as women face difficulties coming up with a female colleague's name; only male colleagues' names come to mind. If the network grows strong, this will hopefully no longer be the case: we will know whom to recommend, because we know who we are and we know each other's work.
We plan to meet once per month and discuss two projects per session, helping each other with feedback and ideas.
Laura Vana ,
Postdoctoral researcher at WU Vienna, working in the fields of data science, quantitative risk management and applied statistics. Open source software supporter and developer.
This talk will use feather, ursa labs, and the latest RStudio release to demonstrate how R and Python can work together and try to move away from dogma. There is an ongoing fight between users of R and users of Python over which programming language is the best for data science. As a user of both, I think spending time elaborating pros and cons of the two is time wasted, especially because the discussion is usually led by dogmas. There is a lot going on to bring the two tools closer together by building bridges over the incompatibility gaps. Exchanging data between R and Python is a solved problem thanks to feather. Building on that thought, Wes McKinney and Hadley Wickham are collaborating to develop data science tools for R and Python. And with the latest RStudio version (1.2), Python might have found itself a proper IDE for data science. This talk will use feather, ursa labs, and the latest RStudio release to demonstrate how R and Python can work together and try to move away from dogma.
Clemens Zauchner ,
Studied business informatics in Innsbruck and data science in London. Worked with companies like OMV, easyJet, Sainsbury's, and The Unbelievable Machine Company. Co-author of the open source R package tableHTML, a tool to create and style HTML tables from R.
In this talk we will explore Ray, a high-performance, low-latency distributed execution framework that lets you run your Python code on multiple cores and scale the same code from your laptop to a large cluster. Ray uses several interesting ideas such as actors, a fast zero-copy shared-memory object store, and bottom-up scheduling. Moreover, on top of a succinct API, Ray builds tools that make your Pandas pipelines faster, find the best hyper-parameters for your machine learning models, train state-of-the-art reinforcement learning algorithms, and much more. Come to the talk and learn some more.
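For readers who have not seen Ray's API: its remote functions follow a future-based pattern that the standard library also offers. As a stdlib-only point of reference (deliberately not Ray itself, so it runs anywhere), `submit()` and `result()` below play roles analogous to Ray's `f.remote(x)` and `ray.get(ref)`:

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    return x * x

def run():
    # Fan work out, collect futures, resolve them later; use
    # ProcessPoolExecutor for CPU-bound work to engage multiple cores.
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(square, i) for i in range(8)]  # like square.remote(i)
        return [f.result() for f in futures]                  # like ray.get(refs)
```

What Ray adds on top of this familiar pattern is exactly what the talk covers: the same code scales to a cluster, and results live in a zero-copy shared-memory object store instead of being pickled between workers.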
Jan Margeta ,
Jan Margeta is the founder of KardioMe, a Python aficionado, recently a speaker at PyCon Slovakia, and a white water kayaker. Jan did his PhD in machine learning for automated medical image analysis at Inria Sophia Antipolis and MINES ParisTech as a Microsoft research PhD scholar. Jan has a master of science degree in computer vision and robotics VIBOT. Now, he is putting all the research experience into real-world use to improve how we treat cardiac diseases. That is why he founded a company called KardioMe. Jan is passionate about using technology to push the boundaries of human knowledge, teaching computers to see, solving hard challenges with data, and making our planet a sustainable place.
Learn how to use the Cython compiler to speed up your Python code. Cython (https://cython.org/) is not only a very fast and comfortable way to talk to native code and libraries from Python, it is also a widely used tool for speeding up Python code. The Cython compiler translates Python code to C or C++ code, and applies many static optimisations that make Python code run visibly faster than in the interpreter. But even better, it supports static type annotations that allow direct use of C/C++ data types and functions, which the compiler uses to convert and optimise the code into fast, native C. The tight integration of all three languages, Python, C and C++, makes it possible to freely mix Python features like generators and comprehensions with C/C++ features like native data types, pointer arithmetic or manually tuned memory management in the same code.
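A classic small example of the kind of code Cython accelerates: the numeric loop below is plain Python and runs unchanged in the interpreter, and compiling it as-is with Cython already speeds it up; adding static types (via `cdef` in a .pyx file or Cython's pure-Python annotations) lets the compiler turn the loop body into native C arithmetic:

```python
def integrate_f(a: float, b: float, n: int) -> float:
    """Approximate the integral of f(x) = x**2 over [a, b] with n rectangles."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + i * dx
        total += x * x * dx   # this inner loop is what Cython compiles to C
    return total
```

This "write Python, annotate the hot path" workflow is what distinguishes Cython from rewriting a module in C by hand.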
Stefan Behnel ,
Stefan is a long-time Python user, Fellow of the PSF, and core developer of the well-known OSS projects Cython (https://cython.org/) and lxml (https://lxml.de/). He works on big data pipelines at TrustYou (https://www.trustyou.com/) and gives lectures and trainings on Cython programming and High-Performance Computing.
In this talk I want to show you how you can use Hydrogen + PWeave to get an alternative to Jupyter Notebooks that is fully diffable and allows you to leverage proper text editors such as Atom and hopefully in a not so distant future VS Code. Jupyter notebooks are a great tool to help with exploratory research. However, they have multiple shortcomings such as being hard to version control and a general lack of tooling such as debuggers, variable explorers and so forth. In this talk I will begin with showing you my current setup when I work with Jupyter Notebooks. I will then show you how Jupyter Notebooks could work based on Rnotebooks developed by RStudio. At the end I will briefly show you how you can replicate Rnotebooks for Python using Hydrogen + PWeave in Atom and the work that is still required to get a similar experience for VS Code.
Christoph Bodner ,
Lover of Discworld novels, math and CS. Hey there, my name is Christoph and I work for the supermarket chain Billa, where I am responsible for all things data science. I have a master's degree in Quantitative Finance and a bachelor's degree in Business, so I am scared neither of talking about martingale theory nor about acronyms such as IFRS, EBITDA or GDPR.
AI is not as objective as we may think: what influences bias in AI? More and more AI applications affect our everyday life without our awareness, from intelligent smartphone cameras to algorithms that decide whether a company will hire us. This talk covers the most important points to consider when using and developing AI solutions. Artificial Intelligence (AI) is getting more and more involved in our daily life, often without us even noticing or being aware of its presence. It may come hidden in intelligent smartphone cameras, it may influence us while shopping, or it may even decide whether we get hired by a company or not. For an AI to function as one, it has to be trained on data before it can actually act intelligently. It may behave in odd, unexpected ways, but one has to remember: maybe the situation is completely new and it simply has not learned the correct behavior yet. Humans constantly learn and adapt over their lives; an AI, however, is trained only on a very limited set of data and thus may develop a completely different and biased view. What causes such bias, what influences it, and how can we attempt to reduce it? This talk sheds some light on bias in AI and how to overcome it when developing AI solutions.
Katrin Strasser ,
Katrin Strasser is an AI software developer at Catalysts with a main focus on Natural Language Processing. This includes planning and developing software for AI projects, as well as business development. During her bachelor's studies in Game Development & Augmented Reality and her master's studies in Bioinformatics she gained insights into a broad variety of IT-related topics.
Some cryptographic schemes, such as AES, are considered verified and even "quantum-computer"-safe. Nevertheless, one is not immune to attacks: gadget attacks and chosen- or known-plaintext attacks exploit weaknesses in the implementation of encryption. Using attack scenarios against email encryption (EFAIL), this talk illustrates which vulnerabilities can effectively bypass the proven AES, and which lessons learned, for example from older, vulnerable TLS protocol versions, would have helped to identify the central implementation problems and questions.
Maha Sounble ,
Maha's ambition: raising awareness of information & IT security in an increasingly digitized world. Maha is an information security expert at TODAY Experts, under contract from A1 Telekom, the leading fixed and mobile network operator in Austria. She obtained her bachelor's degree in integrated security and safety management at the University of Applied Sciences in Vienna. Currently, she focuses on managing security policies, awareness and education. Her ambition is to raise awareness of information security and data privacy in an increasingly digitized world. She leads trainings on IT security and follows with keen interest the future direction of telecommunication in connection with security.
An introduction to the Python ctypes library.
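A minimal taste of what ctypes offers: C-compatible data types and raw memory manipulation from pure Python, without writing any C:

```python
import ctypes

buf = ctypes.create_string_buffer(b"hello world")   # mutable C char array
ctypes.memset(buf, ord("-"), 5)                     # overwrite the first 5 bytes
n = ctypes.c_int(42)                                # a C int, not a Python int

# ctypes can also load shared libraries and call their functions directly;
# on Linux, for example, something like ctypes.CDLL("libc.so.6") exposes libc.
```

The same module underlies many Python bindings to native libraries, which is presumably where the introduction is headed.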
Philipp Schindler ,
Thomas König ,
It's often said that Python comes with batteries included, meaning that the standard library can do basically anything except maybe conjure bacon for you (though I heard that's coming in 3.8). I don't think we fully appreciate the sheer vastness of it, though, so I went through it module by module looking for hidden gems (sorry, eggs). This is a by no means exhaustive compilation of the useful, the underrated, and the funny. Chances are you use the Python standard library on a daily basis -- or more likely, a more or less stable subset of it. The usual way we add things to that subset is by looking for a solution to a problem and ending up being pointed to a standard library module. That, however, means that the odds of you finding out that there is a whole module whose sole purpose is to tell you if a string is a Python keyword are very slim. This talk is not aimed at any specific level of Python experience. We'll go over modules that are interesting in some way: mostly for their usefulness, but in some cases also simply for being wonderfully weird.
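The module alluded to above really exists, and its entire purpose fits in one function call:

```python
import keyword

# The stdlib module whose sole job is telling you whether a string
# is a Python keyword.
print(keyword.iskeyword("lambda"))   # True
print(keyword.iskeyword("spam"))     # False
print(keyword.kwlist[:5])            # a slice of the full list of reserved words
```

Small as it is, it is genuinely useful, e.g. for code generators that must avoid emitting identifiers that collide with reserved words.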
The asyncio library and the concept of coroutines provide a useful abstraction for Python programmers. However, the library can be hard to understand: it is not immediately obvious how to use it, under which circumstances you would pick it over threads or processes, and why. This workshop tries to fill that gap. The workshop starts with an introduction that explains the underlying concept of coroutines and gives a bit of behind-the-scenes insight into how they are implemented in Python. There will be a comparison with threads and processes and guidance on how to pick the right tool for the job. After the introduction the actual workshop begins: participants are handed a Python/Jupyter notebook containing exercises on coroutines and the asyncio library, which they can complete at their own pace. Participants should be comfortable writing Python code, but don't have to be familiar with concurrency or parallelism to participate. Prerequisites: you will need a machine with Python 3.7 installed.
This talk presents MovingPandas, a new Python library for dealing with movement data. Movement data analysis is a high-interest topic in many different scientific domains. Even though Python is the scripting language of choice for many analysts, there is no Python library available so far that would enable researchers and practitioners to interact with and analyze movement data efficiently. MovingPandas development is based on an analysis of state-of-the-art conceptual frameworks and existing implementations (in PostGIS, Hermes, and the R package trajectories). We describe how MovingPandas avoids limitations of SimpleFeature based movement data models commonly used in the GIS (geographic information systems) world. Finally, we present the current state of the MovingPandas implementation and demonstrate its use in stand-alone Python scripts, as well as within the context of the desktop GIS application QGIS. This work represents the first steps towards a general-purpose Python library that enables researchers and practitioners in the GIS field and beyond to handle and analyze movement data more efficiently.
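To make the problem concrete, here is a toy, stdlib-only version of the kind of computation a movement-data library wraps: deriving speed from two timestamped positions. This is not the MovingPandas API (which builds Trajectory objects on top of GeoPandas DataFrames), just an illustration of the per-point arithmetic it automates over whole trajectories:

```python
import math
from datetime import datetime

def speed_m_per_s(p1, t1, p2, t2):
    """Approximate ground speed between two (lat, lon) fixes in m/s."""
    lat1, lon1 = p1
    lat2, lon2 = p2
    # Equirectangular approximation; fine for short distances.
    r = 6371000.0
    x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    meters = r * math.hypot(x, y)
    return meters / (t2 - t1).total_seconds()

t1 = datetime(2019, 5, 3, 12, 0, 0)
t2 = datetime(2019, 5, 3, 12, 1, 0)
v = speed_m_per_s((48.2, 16.3), t1, (48.2, 16.31), t2)  # roughly 12 m/s
```

Doing this robustly over millions of points, with proper geodesy and time handling, is exactly the gap a dedicated library fills.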
Anita Graser ,
Anita Graser is a scientist, open source GIS advocate, and data visualization geek. Her background is in computer science with a specialization in geographic information science, and she is currently working with the Center for Mobility Systems at the AIT Austrian Institute of Technology in Vienna. She serves on the QGIS project steering committee and the OSGeo board of directors, and teaches QGIS classes at UNIGIS Salzburg. She has published several books about QGIS, including "Learning QGIS" (currently in its 3rd edition), "QGIS Map Design", and "QGIS 2 Cookbook". Furthermore, she has developed tools such as the Time Manager and pgRoutingLayer plugins for QGIS.
How to use the power of Python to build a trading bot for Bitcoin in just one weekend. I will show a lot of code. I will show you how to implement a simple trading bot in Python. All parts of trading will be shown: connecting to a Bitcoin exchange, buying and selling Bitcoin, and the algorithm used to decide when to buy or sell. I will also show you how I built this trading bot in one weekend, so there will also be a part on how to manage yourself and stay focused to actually finish your side project.
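As a flavor of the decision-making core of such a bot, here is a toy moving-average crossover (an illustration only, not the strategy from the talk, and certainly not trading advice):

```python
def decide(prices, short=3, long=7):
    """Return 'buy', 'sell', or 'hold' based on a simple SMA crossover."""
    if len(prices) < long:
        return "hold"            # not enough history yet
    short_avg = sum(prices[-short:]) / short
    long_avg = sum(prices[-long:]) / long
    if short_avg > long_avg:
        return "buy"             # recent momentum is up
    if short_avg < long_avg:
        return "sell"            # recent momentum is down
    return "hold"
```

The rest of a real bot is plumbing around a loop like `decide(fetch_prices())`: exchange API authentication, order placement, and error handling.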
On the last Sunday of October you may get "one more hour of sleep", but you may spend much more time debugging code that deals with time zones, daylight saving time shifts, and datetime stuff in general. This talk is 20% about programming and 80% about world history and short-sighted political decisions with a long-term impact on technology. After a short overview of the standard datetime module and its usage in different geographical contexts, we'll have a look at the pytz library and discover all the 591 timezones it comes with. We'll see why pytz is not part of the standard library, as well as when, how, and why this package gets frequent updates. At the end we'll have a look at a few pitfalls that may make you avoid timezones altogether.
Short survey of what is happening with Python at Bosch. Ranges from MicroPython to full-scale containerized microservices.
nibbler explores the concept of using existing Python syntax features, such as type annotations and decorators, to speed up code execution by running additional bytecode optimization passes that make use of the runtime context provided through these means. The talk also gives an overview of the CPython internals related to bytecode and bytecode execution.
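As a toy illustration of the general idea (this is not nibbler's actual API), a decorator can inspect a function's type annotations at definition time and decide whether a specialised fast path would apply:

```python
def specialise(func):
    """Toy decorator: record whether an int-only fast path would apply."""
    params = {n: t for n, t in func.__annotations__.items() if n != "return"}
    func.fast_path = bool(params) and all(t is int for t in params.values())
    # A real optimiser could now rewrite the function's bytecode;
    # the stdlib dis module gives a view of the instructions involved.
    return func

@specialise
def add(a: int, b: int) -> int:
    return a + b

@specialise
def join(a: str, b: str) -> str:
    return a + b

print(add.fast_path, join.fast_path)  # True False
```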
Get started using Docker to run your Python code. No previous Docker knowledge required. We will use Flask to make a small Python app, put it in a Docker container, and then have it automatically built for us on CircleCI. We will then look into how we can use Kubernetes to manage our container. We will be going through this doc: https://docs.google.com/document/d/165PC4KFmLELeqVk7U2pxFcepcfKndXK2xjsEfqRTqio/edit# Please bring a laptop so we can go through the document together.
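A minimal sketch of the kind of Dockerfile involved (file names and versions are illustrative; the linked workshop document is authoritative):

```dockerfile
# Image for a small Flask app; app.py is assumed to start the server
FROM python:3.7-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 5000
CMD ["python", "app.py"]
```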
Do you want to analyse historical newspapers with Python? Does training your CNN on historical postcard images sound nifty to you? Do you want to search within the Austrian Webarchive from the comfort of your home? We got you covered!

We use prepared (and pre-shared) Jupyter Notebooks to illustrate:
- The data the Austrian National Library has to offer (for free)
- Which Python libraries make accessing and processing these data easier
- Some example applications using these data within Jupyter

Participants are invited to either follow along the guided tour through some of the shared Notebooks with the rest of the group, or work at their own pace through the provided material, asking questions as they arise.

Preliminary Rough Outline
- Workshop overview
- Metadata & Catalogue: overview of data formats, container formats and protocols; examples: SRU, data harvesting via OAI-PMH, SPARQL
- Images & Text: overview of IIIF and OCR formats; examples: downloading OCR text, downloading pre-resized images for machine learning, creating a IIIF collection from a SPARQL query result
- Webarchive: overview of the Webarchive, its API and content; examples: Wayback search via API, full text search via API

Requirements for participants: a laptop, connectivity, Python 3 and a working Jupyter Notebook installation.

Material: We'll publish a requirements.txt and the selected Notebooks 1 week before the workshop, and the slides 1 day before the workshop, here: https://labs.onb.ac.at/gitlab/labs-team/pydays19

Language: Slides and Notebooks in English; workshop in English (or German, if all participants prefer that).

Presenters: Georg Petz is the senior software developer of the Austrian National Library's R&D department. Stefan Karner is the software developer of the ONB Labs project.

Links: https://labs.onb.ac.at
Stefan Karner ,
Stefan Karner has been the lead programmer for the Austrian National Library's Labs project since 2017. Before settling on software development, he studied (amongst other things) history, political sciences and jazz vocals. He lives and works in Vienna.
Since the introduction of annotations, the typing module and mypy, optional static typing has become available for Python. We are going to discover the basic typing functionality, move on to more advanced typing features, learn how to configure mypy, and write a small mypy plugin.
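A tiny example of the kind of annotations involved; note that annotations are not enforced at runtime, which is exactly why mypy's static checks are useful:

```python
from typing import Optional

def greet(name: Optional[str] = None) -> str:
    """A function with type annotations; mypy checks callers statically."""
    if name is None:
        return "Hello, world!"
    return f"Hello, {name}!"

print(greet("PyDays"))  # Hello, PyDays!
# mypy would reject greet(42) as an incompatible argument type,
# but at runtime the mistake would only surface when the call misbehaves.
```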
Philipp Konrad ,
Hacking, in the classical use of the word, is probably the most mysterious and yet one of the most male-dominated fields in the whole of IT. Many of us imagine a nerdy dude in a black hoodie in some trashy basement, happily hacking a bank, the state, or some big money or power business. There is, however, something very subversive about these people sitting in their basements, working against all controlling powers, something that could make a fine feminist narrative. In this talk, we will work towards demystifying the basics of information security with Python, and on the way discuss this feminist hacker narrative, including historical examples.
The basics, that would be: how information gets transmitted on the internet (networks, network layers, protocols), how this information is secured (encryption), and how we can analyse it (network and intrusion detection). After settling the basics with some Python examples, we will take a deeper plunge into the speaker's speciality: data science in intrusion detection. At its core, intrusion detection often boils down to anomaly detection, and we will learn how to train scikit-learn algorithms on log files to see whether a network has been hacked.
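The talk trains scikit-learn models; as a dependency-free sketch of the underlying idea (an anomaly is an observation far from the norm), here is a simple statistical detector on hypothetical log-derived request counts:

```python
import statistics

def find_anomalies(counts, threshold=2.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(counts)
    stdev = statistics.pstdev(counts)
    return [c for c in counts if abs(c - mean) > threshold * stdev]

# Hourly request counts from a hypothetical server log; the spike
# at 950 could indicate a brute-force attempt.
requests_per_hour = [120, 130, 125, 118, 122, 950, 128, 124]
print(find_anomalies(requests_per_hour))  # [950]
```

scikit-learn's anomaly detectors (e.g. IsolationForest) generalise this idea to many features at once.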
Carina Karner ,
Thinks science is art & info-sec can be cool. Does all sorts of sciency, arty and hacky things. Organizes meetups and political things. Can be found at hack*spaces, squats and queer clubs. Usually with Club-Mate or coffee (black, without sugar) in her hands.
Keeping code readable and consistent can be challenging. Fortunately, the Python ecosystem has developed many tools to help with that! Let’s see what we can use to achieve the best results. You’ll learn how linters can make your life easier… and what to be careful about so they don’t make it harder instead. Nowadays there are so many Python linters that it can be hard to choose the best solution for your project. I’ll show some of the most widely-used ones and compare them briefly. I’ll also share my experience with using them, both good and bad. We’ll end with some questions about the usage of linters in different projects. I won’t give strict answers, but hopefully after this talk it’ll be easier for you to find them yourself.
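Most linters read their settings from a config file in the project. A hypothetical flake8 section in setup.cfg might look like this (the option names are real flake8 options; the values are a matter of team taste):

```ini
[flake8]
max-line-length = 88
extend-ignore = E203
exclude = .git,__pycache__,build
```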
Ania Kapuścińska ,
Software developer at Genomics England, previously at Clearcode. Working mostly with Python, always striving to learn more. Every day she tries to make people's lives a bit better. Loves astrophysics, travelling, books and coffee, sometimes in a different order.
The Pandas soon realized there was no way they were going to survive the ordeals and hardships of this world if they didn't finally, and without the blink of an eye of hesitation, pull themselves together, stop being the lazy fluffy beings they had long been known for, and start reorganizing their lives ASAP. They needed a fresh view of the world and its intrinsic mechanisms; light had to be shed upon the information they possessed about survival; in a few words, they had to start over. This is how, in the midst of the forest, a highly performant library was coming to life, whose powerful toolkit would enable them a long-lasting life of happiness and joy. This long-dreamed library should import the information they had been gathering about the world for long-gone centuries and help them look at it through different eyes. They wanted to structure their world views and beliefs into sensible types and categories, remove from their genes their procrastinative behavioural patterns, drop them altogether. After laborious efforts of dealing with missing data about their surroundings, grouping and counting the meaningful rest, and filtering the nonsensical superstitions, they could finally, and without doubt, point out with precision where the bamboo sprouts were freshest, most succulent, fiber-rich and absolutely scrumptious, and the moment of the year, dictated by the moon calendar, when they were fluffiest, cosiest, and most willing to irresistibly fall for one another and cuddle up. They put all this secret survival kit into easily understandable pictures and graphs for the dreamers among them, who weren't prepared to put in all the effort of learning all those complicated symbols just in order to survive, and just wanted to admire the sky goddess, the moon. But wait, they didn't have a name for their grandiose library! So they just wanted to make a statement of being the most diligent creature of them all and called it, simply and unmistakably, pandas!
Ingrid ,
Georg Petz ,
Georg Petz is responsible for software development in Austrian Books Online (ABO) and the implementation of a Linked Data Platform at the Austrian National Library. He holds a master’s degree in business and medical informatics and has more than 10 years’ experience in software engineering for digital libraries.
Aiming at complete code coverage by unit tests tends to be cumbersome, especially where external API calls are part of the code base. For these reasons, Python comes with the unittest.mock library, a powerful companion for replacing parts of the system under test. First and foremost, there will be a thorough discussion of the relevant use cases implemented in Python’s unittest.mock library. Moving on, I will outline how this mocking functionality can be embedded in a pytest-based test suite, and discuss the feasibility of replacing parts of the system under test. Eventually, I will discuss examples of production-code unit tests that make use of the mock object library, thereby contributing to a solid understanding of the matter.
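A minimal sketch of the pattern (`fetch_rate` and its URL are hypothetical stand-ins for production code that calls an external API):

```python
from unittest import mock
import urllib.request

def fetch_rate(currency):
    """Production-style code that performs an external API call."""
    url = f"https://api.example.com/rate/{currency}"
    with urllib.request.urlopen(url) as response:
        return float(response.read().decode())

def test_fetch_rate():
    # Replace urlopen with a mock so the test needs no network access
    fake = mock.MagicMock()
    fake.read.return_value = b"1.23"
    fake.__enter__.return_value = fake  # support the `with` statement
    with mock.patch("urllib.request.urlopen", return_value=fake):
        assert fetch_rate("EUR") == 1.23

test_fetch_rate()
```

In a pytest suite the same replacement is often done via the `mocker` fixture from pytest-mock, but plain `mock.patch` works everywhere.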
Rainer Schuettengruber ,
A seasoned database engineer and Linux enthusiast who believes that Python is the tool of the trade when it comes to getting rid of boring tasks.
Merkel might not be familiar with 17th century British Parliamentary rules, but you will be after this workshop. You'll learn to analyse 200 years of British political debates with web scraping, data science and natural language processing. Dr Maryam Ahmed (BBC News) will share the unique challenges of analysing the Hansard Archive, an online record of every Parliamentary speech from 1803 to the present day. You'll learn how to ethically scrape Hansard with the headless browser Selenium, and transform messy HTML into structured data with Pandas and BeautifulSoup. Maryam will explain how to find themes in political speeches with NLTK and scikit-learn methods including TF-IDF and Latent Dirichlet Allocation. Spoiler: her talk will contain at least one clip of John Bercow shouting 'order'.
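The workshop uses scikit-learn's implementation; as a dependency-free sketch of what TF-IDF computes (on toy speeches, not real Hansard data):

```python
import math
from collections import Counter

def tf_idf(docs):
    """Weight terms by frequency in a document vs. rarity across documents."""
    n = len(docs)
    df = Counter()                      # in how many documents each term occurs
    for doc in docs:
        df.update(set(doc.split()))
    scores = []
    for doc in docs:
        counts = Counter(doc.split())
        total = sum(counts.values())
        scores.append({term: (c / total) * math.log(n / df[term])
                       for term, c in counts.items()})
    return scores

speeches = ["order order order", "the budget motion", "the order paper"]
scores = tf_idf(speeches)
# "budget" outweighs the common word "the" in the second speech
print(scores[1]["budget"] > scores[1]["the"])  # True
```

scikit-learn's TfidfVectorizer adds tokenisation, normalisation and sparse output on top of this core idea.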
Maryam Ahmed ,
Dr Maryam Ahmed is a data scientist at BBC News, where she enjoys using data science and machine learning to find news in messy political datasets and investigate algorithmic bias. She is a strong advocate of scrutiny and transparency in the public sphere and has spoken on this topic at the Mozilla Festival, the Open News Unconference at the Royal Society of Arts and the Impacts of Civic Technology conference at the OECD. Maryam believes that education is the greatest leveller and enjoys running coding events for children from minority backgrounds.
There is an ongoing fight between users of R and users of Python over which programming language is the best for data science. As a user of both, I think spending time elaborating the pros and cons of the two is time wasted, especially because the discussion is usually led by dogmas. There is a lot going on to bring the two tools closer together by building bridges over the incompatibility gaps. Exchanging data between R and Python is a solved problem thanks to feather. Building on that thought, Wes McKinney and Hadley Wickham are collaborating to develop data science tools for R and Python. And with the latest RStudio version (1.2), Python might have found itself a proper IDE for data science. This talk will use feather, Ursa Labs and the latest RStudio release to demonstrate how R and Python can work together, and try to move away from dogma.
Dana Jomar ,
A learner and a data scientist with a logical and mathematical background, Dana started a journey into the world of big data and data science, during which she has been able to work on solving a variety of business problems and use many of a data scientist's must-have tools. Recently Dana had the chance to contribute to the development of version 2.0.0 of the open source R package tableHTML.
Python makes wonderful code accessible at our fingertips, but it also allows us to take a lot of liberty. When you start, the code is beautiful and makes sense. With each step you take, more and more hacks show up. Eventually, the brilliant codebase you started with can no longer be seen under the mud. Not every codebase can be rewritten, and not everything can be redone from scratch. Cool libraries often take a lot of liberty with code structure and make cross-integration difficult, if not impossible. Tests sound ideal, but in practice they're just so hard to get right. Why is this, and how can we change it? This talk will describe experiences and lessons learned from tackling extremely demanding code: how to bring order to mismanaged code and elevate the code base to a standard that's acceptable in today's tech environment. The talk tackles the problems in three parts:
- The Easy Wins: what to do to instantly increase the code quality in your organisation. How? And why isn't it enough?
- Patterns and Antipatterns: how to identify code that "smells" and how to replace it. How to integrate with old code, better.
- The Philosophy: what approach to set down for the future, why to care, how to write replaceable code, and how to prevent history from repeating itself.
The talk will close with audience discussion and experiences, questions and proposals, building a collection of on-premise tips and tricks.
Tin ,
Tin Marković is a software engineer working in Python and a team lead at Kiwi.com: a full stack developer with a strong software architecture bend, specialized in designing systems rather than components, who tries to spread knowledge of code as a product rather than an ideal. After higher education in Bosnia and Herzegovina and Croatia, where he earned a master's degree in Computer Science and Engineering, he interned in Slovakia and Croatia, then worked with the Python-specialized ExtensionEngine (and by extension edX) before joining Kiwi.com. As a team lead at Kiwi.com, Tin has encountered the challenges that come from interlocking dozens of systems with the complex logic the travel industry presents. A dedicated professional, Tin is more than eager to meet and converse with fellow attendees and speakers, looking to build long-lasting contacts and potential for fruitful cooperation.
A short introduction to all things Python. Beginner friendly, but depending on the audience we can go as deep as needed. A beginner workshop for up-and-coming Python programmers. Requirements: a laptop with Python 3 and a code editor installed (VS Code is probably a good baseline), and optionally also PyGame.
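If you want to check your setup before the workshop, a few lines like these (the workshop will define its own exercises) exercise the basics of variables, lists, loops and functions:

```python
def shout(word):
    """Return the word in upper case with an exclamation mark."""
    return word.upper() + "!"

names = ["Ada", "Grace", "Guido"]
greetings = [shout("hello " + name) for name in names]
print(greetings)  # ['HELLO ADA!', 'HELLO GRACE!', 'HELLO GUIDO!']
```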
Reinforcement Learning is a powerful approach to machine learning which is based on experience, without prior knowledge or guidance from experts. It enables an AI to independently create models of its environment and develop appropriate action strategies for goal-oriented tasks. The self-learning algorithms can be applied to time-dependent problems in a changeable and unknown environment. Applications include game AI (AlphaGo / AlphaGo Zero), real-time decisions and robot navigation. The aim of the lecture is to provide an insight into the theoretical and conceptual fundamentals of Reinforcement Learning, as well as a basic understanding of the best-known RL algorithms (SARSA, Q-Learning, ...). The formal framework of Markov Decision Processes will be discussed, allowing time-dependent and decision-based tasks to be represented in a meaningful way: the learning task is modelled as an interaction between the "environment" and an "agent" acting in it. The goal of the agent is to find strategies that maximize a previously defined reward from the environment. Goal-oriented behavior can thereby be translated into an optimization problem, which is approximately solved using the experience gained by the agent. The methods are demonstrated with simple applications from deterministic and stochastic environments, with code examples.
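A minimal, dependency-free sketch of the Q-learning update on a toy chain world (this example is mine, not from the talk; to keep it deterministic it sweeps over all state-action pairs instead of using an exploring agent):

```python
N_STATES = 5                 # states 0..4 in a chain; stepping right
ACTIONS = [0, 1]             # off the end (state 4) yields reward 1
ALPHA, GAMMA = 0.5, 0.9      # learning rate and discount factor

def step(state, action):
    """Deterministic environment dynamics: returns (next_state, reward)."""
    if action == 1:                      # move right
        if state == N_STATES - 1:
            return 0, 1.0                # reached the goal, restart
        return state + 1, 0.0
    return max(state - 1, 0), 0.0        # move left

Q = [[0.0, 0.0] for _ in range(N_STATES)]
for _ in range(200):
    for s in range(N_STATES):
        for a in ACTIONS:
            s2, r = step(s, a)
            # Q-learning update: nudge Q(s, a) towards
            # reward + discounted value of the best next action
            Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])

# The learned values prefer "right" (action 1) in every state
print(all(Q[s][1] > Q[s][0] for s in range(N_STATES)))  # True
```

In the talk's setting the agent instead discovers these values through interaction, e.g. with epsilon-greedy exploration, but the update rule is the same.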
Daniel Pasterk ,
Python is arguably the leading programming language used in AI, especially in the machine learning field. During this short presentation you will have a chance to go with me through an overview of arguably the most popular Python frameworks and libraries used in ML and RL. The presentation is the result of a short qualitative survey as well as a summary of selected external sources. What's most interesting, and hopefully useful, is the overview of the current state of the art in the most popular ML & RL directions. During the talk you will be able to observe this subject from a couple of angles: plug-and-play / AI-as-a-Service solutions (business), research (academia), and the libraries that you basically like to use (fun ;)).
Kamila Stępniowska ,
I think of Python as a door opener to programming, data science & ML. My professional story is strongly connected with education, programming, data science and diversity, with experience shared between the US and Europe. In previous years I created B2B data science education programs with deepsense.ai, and supported diversity as the Women Who Code Seattle Evangelist, an Advisor at She’s Coding and a SheWorx Steering Committee member. As COO at Geek Girls Carrots (GGC) I built a global community of women in tech. During a couple of years as a PhD student, I did research on open source programming projects at the Chair of Sociology of Culture, Institute of Sociology (University of Warsaw). Currently I work with 10Clouds in a Business Development position, and I also support the 10Clouds ML team on business strategy. I'm an amateur photographer, a Japanese culture fan, and a runner living between Europe and the US.