Program

The Excel@FIT 2020 conference will be held online as a webinar on Wednesday, 27 May 2020, and will present the accepted papers.

Conference program

9.00	Welcome
9.05	Student presentations (short presentations of the most interesting papers)
10.40	Break (stretch, air out the room, refill mugs and glasses)
10.50	A master's thesis is an opportunity! Hmm, but for what exactly? (panel discussion with successful students and company representatives)
11.30	Announcement of the results of the committees, the company selection and the public vote (with remote handover of prizes and company gifts)
11.50	Closing

Presentations of selected students

Platform for Cryptocurrency Address Collection

Vladislav Bambuch

web scraping, cryptocurrencies, crypto crime detection, microservices, apache kafka, data streaming

Security, Computer networks, Web technologies, Data processing (image, sound, text, etc.)

The goal of this work is to build a platform for collecting and displaying metadata about cryptocurrency addresses from the public and dark web. To achieve this goal, the author uses web parsing technologies written in PHP. The challenges of parsing websites are addressed by the scaling capabilities of the Apache Kafka streaming platform. The platform's modularity is achieved through a microservice architecture and Docker containerization. By building a web application on top of the platform, which serves both for managing the platform and for exploring the extracted data, the work offers a unique way to search for potential crypto-criminal activities that appear outside the blockchain world. The platform architecture allows loosely coupled modules to be added smoothly, with Apache Kafka mediating communication between them. The result is meant to be used for cybercrime detection and prevention. Its users can be law enforcement authorities or other agencies interested in the reputation of cryptocurrency addresses.

EnzymeMiner: Web Server for Automated Mining of Soluble Enzymes

Simeon Borko

enzyme mining, novel biocatalysts, web server

Bioinformatics

Millions of protein sequences are being discovered at an incredible pace, representing an inexhaustible source of biocatalysts. While genomic databases grow exponentially, classical biochemical characterization techniques remain time-demanding, cost-ineffective and low-throughput. Therefore, computational methods are being developed to explore the unmapped sequence space efficiently. Selection of putative enzymes for biochemical characterization based on rational and robust analysis of all available sequences remains an unsolved problem. To address this challenge, I have developed EnzymeMiner, a web server for automated screening and annotation of enzymes that enables selection of hits for wet-lab experiments. EnzymeMiner prioritizes sequences that are more likely to preserve the catalytic activity and to be expressible in a soluble form in the heterologous host organism Escherichia coli. EnzymeMiner reduces the time devoted to data gathering, multi-step analysis, sequence prioritization and selection from days to hours. EnzymeMiner is a universal tool applicable to any enzyme family. It provides an interactive and easy-to-use web interface, freely available at https://loschmidt.chemi.muni.cz/enzymeminer/.

Static Deadlock Detection in Frama-C

Tomáš Dacík

Static Analysis, Deadlock Detection, Frama-C

Testing, analysis and verification

Frama-C is a platform for static analysis of source code written in the C language. It provides a wide range of analysers, usually based on EVA, Frama-C's value analysis plugin. Although some attempts have been made to support analysis of multi-threaded code in Frama-C, the platform as a whole is currently limited to the analysis of sequential code. In this paper, we present Deadlock, a new Frama-C plugin focused on deadlock detection. Together with the core deadlock detection algorithm, we present a technique that our analyser uses to handle multi-threaded code partially as sequential code, which allows us to improve the precision of the analysis by using existing Frama-C plugins. In our experimental evaluation, we show that our tool is able to handle real-world C code with high precision.
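
The abstract does not spell out the plugin's core algorithm. As a rough illustration only, the sketch below shows one common idea behind deadlock detection: record the order in which locks are acquired, build a lock-order graph, and report a potential deadlock when the graph contains a cycle. The trace format and the names `build_lock_graph` and `has_cycle` are assumptions made for this example and are not taken from the Deadlock plugin.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical trace format: one list of ("lock" | "unlock", lock_name) per thread.
Trace = List[Tuple[str, str]]


def build_lock_graph(traces: Iterable[Trace]) -> Dict[str, Set[str]]:
    """Add an edge held -> acquired for every lock taken while another is held."""
    graph: Dict[str, Set[str]] = {}
    for trace in traces:
        held: List[str] = []
        for op, lock in trace:
            if op == "lock":
                for h in held:
                    graph.setdefault(h, set()).add(lock)
                graph.setdefault(lock, set())
                held.append(lock)
            elif op == "unlock" and lock in held:
                held.remove(lock)
    return graph


def has_cycle(graph: Dict[str, Set[str]]) -> bool:
    """Depth-first search; reaching a node that is still being visited means a cycle."""
    visiting: Set[str] = set()
    done: Set[str] = set()

    def visit(node: str) -> bool:
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph[node])
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(n) for n in graph)


if __name__ == "__main__":
    thread_1 = [("lock", "A"), ("lock", "B"), ("unlock", "B"), ("unlock", "A")]
    thread_2 = [("lock", "B"), ("lock", "A"), ("unlock", "A"), ("unlock", "B")]
    print(has_cycle(build_lock_graph([thread_1, thread_2])))
```

In this example the two hypothetical threads take locks A and B in opposite orders, so the graph contains the cycle A -> B -> A. A real static analyser has to derive such lock orders from source code rather than from recorded traces.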

Speech Enhancement with Cycle-Consistent Neural Networks

Pavol Karlík

Speech Enhancement, Deep Learning, Cycle-Consistency

Robotics and artificial intelligence, Data processing (image, sound, text, etc.)

Speech enhancement aims to improve the intelligibility and overall perceptual quality of speech. Neural networks (NNs) have become a standard approach to this problem. NNs are usually trained by comparing the network output to the target sample. In our work, we incorporate a cycle-consistency constraint into training to improve the network's robustness: we add a second NN to the process. The second NN performs the opposite task; its goal is to introduce noise into a clean speech recording. The networks are trained in a cycle, each taking the output of the other network as its input. Cycle-consistency, among other things, exposes the network to a much larger variety of noisy data, which improves its robustness. We perform experiments on both paired and unpaired data, which is made possible by adding adversarial training. The DNN models are evaluated using an automatic speech recognition system. The speech enhancement models trained and evaluated in this work are based on a recent publication. The results show that adding cycle-consistency improves the models' performance significantly.

Enticing – Semantic Search Engine

David Kozák

search engine, semantic enhancement, MG4J, compiler, indexation, searching, annotation, big data

Web technologies

The topic of this paper is semantic search over large textual data. It describes the design and implementation of Enticing, a search engine that queries semantically enhanced documents efficiently and has a user-friendly interface for working with the results. First, state-of-the-art solutions are analyzed along with their strengths and shortcomings. Then a design for a new search engine is presented along with a specialized query language, EQL. The system consists of components for indexing and searching the documents, a management server, a compiler for the query language and two clients, one web-based and one command-line. The engine has been successfully designed, developed and deployed and is available via the Internet. As a result, semantic search is available to a wide audience.

Detection of Concurrency Errors in Multi-Process Programs

Monika Mužikovská

ANaConDA, Dynamic analysis, Concurrency errors, Multi-process programs

Testing, analysis and verification

Dynamic analysis is used successfully to detect errors in multi-threaded programs. The algorithms designed for this purpose are, however, often applicable to multi-process programs as well. Yet none of the known dynamic analysis tools supports process monitoring. The goal of this work was to extend the ANaConDA tool with analysis and monitoring of multi-process programs. The result is an implementation of an extension that handles, on behalf of analyser developers, the problems connected with separate address spaces and synchronization through semaphores. The extension was used to adapt the AtomRace analyser to detect data races in multi-process programs and was applied in experiments with student projects from the Operating Systems course. The results of the experiments showed that ANaConDA can become a welcome aid in implementing multi-process projects, and not only those.

Benchmarking medical segmentation models with limited training sets

Kateřina Trávníčková, Oldřich Kodym

Segmentation, Deep learning, Medical data, Image restoration, Limited training set

Computer graphics, Robotics and artificial intelligence, Data processing (image, sound, text, etc.)

Deep learning-based medical data segmentation methods can already provide excellent results. However, these results are obtained mostly thanks to large training data sets. Obtaining a sufficient amount of correct annotations can be problematic in the medical field. This paper describes the problem of training medical segmentation models with limited annotations and proposes solutions to address it. We compare a group of baseline segmentation models with two other model groups that use different means to combat the lack of data. The first group is pretrained in an unsupervised manner and the second uses human interaction in the form of guidance clicks. We train 14 models for each group on subsets with varying numbers of patients. A segmentation model trained on a small number of patients achieves better results when pretrained in an unsupervised manner on the whole training set of 70 patients. Even better results are obtained with the interactive method: training on only two patients reaches a Dice score of 0.929, whereas the pretrained model reaches 0.830 and the baseline model only 0.749.
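
For readers unfamiliar with the Dice score reported above, here is a minimal sketch of how it is commonly computed for two binary masks: twice the size of their overlap divided by the sum of their sizes. The function name `dice_score`, the flat-list mask representation, and the convention for two empty masks are assumptions made for this illustration, not details taken from the paper.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1, 0]  # hypothetical flattened prediction mask
    reference = [1, 0, 0, 0, 1, 1]  # hypothetical flattened ground-truth mask
    print(round(dice_score(predicted, reference), 3))  # prints 0.667
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they do not overlap at all.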

Infrastructure for Testing and Deployment of the Real-Time Localization Platform

Michal Ormoš

RTLS Systems, UWB, CI/CD, Indoor Localization

Testing, analysis and verification

Fast development and deployment of software is a defining trend of the current era, and the field of real-time localization systems (RTLS) is no exception. In a world where the Global Positioning System (GPS) is an everyday utility, there is also a need for localization indoors, where GPS signals cannot reach. This is where local positioning systems based on ultra-wideband (UWB), which offer high precision, come in. This work addresses the problem of fast delivery of the software responsible for an RTLS. It presents a case study on how to develop, test, and deploy such a system in a fast CI/CD environment with the help of DevOps principles. This requires introducing new techniques and methods for validating and testing the precision of these systems.

Inclusion of Regular Expressions with Counting

David Mikšaník

regular expressions, language inclusion, finite automata, counting automata

Testing, analysis and verification

We present an algorithm solving the inclusion problem for regular expressions with the counting operator limited to character classes, the so-called extended regular expressions (eREs), which are common in practice. Such regular expressions do not extend expressiveness beyond regularity, but allow one to succinctly express repeated patterns. Our algorithm is based on the transformation of eREs into monadic counting automata (MCAs), i.e., finite automata with counting loops on character classes where each counter is bounded. As in the classical algorithm, we transform eREs into automata, but we use MCAs instead of nondeterministic finite automata (NFAs). We then build the product of the MCAs and search for a final state in the product. MCAs are a compact representation of eREs because the number of states in an MCA does not depend on the bounds used in the counting operator, in contrast to NFAs, where the number of states grows linearly with the bounds. These bounds can be large in practice, so MCAs are often significantly smaller than NFAs. We provide several examples for which the classical algorithm working with NFAs does not terminate in a reasonable amount of time, but our algorithm does. We also hope that our algorithm outperforms the classical algorithm in general, especially if the bounds of the counting operators are large.
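
The size argument above can be illustrated with a small sketch. It is not the paper's MCA construction: it only compares the state count of a naively unrolled automaton for the hypothetical pattern `a{n}` with a matcher that keeps one control state and a bounded counter. The names `unrolled_nfa_states` and `match_with_counter` are assumptions made for this example.

```python
def unrolled_nfa_states(bound: int) -> int:
    """States of the naive automaton for a{bound}: the start state plus one per consumed 'a'."""
    return bound + 1


def match_with_counter(text: str, bound: int, symbol: str = "a") -> bool:
    """Match symbol{bound} using a single control state and a counter bounded by `bound`."""
    counter = 0
    for ch in text:
        # Reject on a wrong symbol or when the counter would exceed its bound.
        if ch != symbol or counter == bound:
            return False
        counter += 1
    return counter == bound


if __name__ == "__main__":
    for n in (10, 1000, 100000):
        # The unrolled automaton grows with n; the counter-based matcher does not.
        print(n, unrolled_nfa_states(n), match_with_counter("a" * n, n))
```

The unrolled automaton needs `n + 1` states, while the counter-based matcher keeps its structure fixed and only the counter's bound changes, which is the intuition behind the compactness claim.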

Adaptive SYN Flood Mitigation Based on Attack Vector Detection and Mitigation Process Monitoring

Patrik Goldschmidt

TCP SYN Flood, DDoS Mitigation, Adaptive DoS Protection

Computer networks

TCP SYN Flood is one of the most widespread types of DoS attack on computer networks today. The attack comes in many forms, and several different mitigation methods exist to deflect it. This paper discusses these attacks and various mitigation approaches, and presents a mechanism able to choose the most suitable method to mitigate an attack. The suggestion is made according to network traffic and the properties of the mitigation methods. After the suggested method is deployed, the algorithm also monitors its behavior and may suggest a different strategy when the one currently in use proves ineffective. Our experiments have shown that the mechanism is able to detect several attack variants and suggest a suitable method to deflect them while minimizing the impact on the end user as far as possible. On the other hand, the suggestion accuracy depends heavily on the available mitigation methods and their properties, which need to be set manually before the system can be used.