The Student Computing Research Symposium is an annual research conference jointly organized by the University of Ljubljana, the University of Maribor, and the University of Primorska. Its goal is to encourage students to present and publish their research in the domain of computer science and to facilitate cooperation and creativity.
We invite both BSc and MSc students to submit research papers from all fields of computer science. Papers are limited to 4 pages. MSc students must write their papers in English, while BSc students may choose either English or Slovene. Papers that pass the acceptance threshold will be designated for either oral or poster presentation.
Having a paper presented at a conference and published in the conference proceedings is a reward in itself. Nevertheless, the authors of the best papers, posters, and/or presentations will receive additional awards contributed by the organizers and our industry partners.
Normalizing flows are a family of deep generative models and exact density estimators that have recently gained a lot of attention in computer vision and general machine learning applications. There are also ongoing concurrent efforts to improve Bayesian inference methods using deep models. In this talk, we will familiarize ourselves with normalizing flows and current developments that strive to connect them with Bayesian inference methods to obtain fast, accurate and efficient algorithms for various inference tasks.
In our work, we tackle the problem of 3D object classification, a task traditionally closer to computer graphics than to network analysis. There are several approaches to this problem, including deep learning, topological data analysis, and graph theory. We use the Mapper algorithm, which transforms a point cloud into a graph, simplifying the data while retaining the key properties of its structure. This algorithm has already been used for this task; however, the features extracted from the graph were very limited. The novelty we introduce is the calculation of network properties on such graphs, which are then used for classification. The results show that the models achieve better classification accuracy when network-analysis attributes are used in addition to attributes from topological analysis; however, it remains challenging to determine exactly which features will perform well for object classification.
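To make the idea of network-analysis attributes concrete, here is an illustrative sketch (not the paper's implementation): once Mapper has produced a graph, simple structural features such as average degree or local clustering can be computed directly from an adjacency dictionary and fed to a classifier.

```python
def degree_stats(adj):
    """Average and maximum degree of a graph given as an adjacency dict."""
    degrees = [len(nbrs) for nbrs in adj.values()]
    return sum(degrees) / len(degrees), max(degrees)

def clustering_coefficient(adj, v):
    """Fraction of pairs of v's neighbours that are themselves connected."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return 2.0 * links / (k * (k - 1))

# A toy Mapper-style graph: a triangle with one pendant node.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
avg_deg, max_deg = degree_stats(graph)
cc0 = clustering_coefficient(graph, 0)
```

Feature vectors built from such per-graph statistics can then be passed to any standard classifier.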
In equation discovery tasks, strong emphasis is usually put on the generation of expressions. Classically, expressions are generated using context-free grammars, evolutionary algorithms, and similar approaches, but recently generators based on deep learning have started to emerge. First attempts at generating discrete, structured data include variational autoencoders (VAE) for simple, unconstrained character sequences, and grammar VAEs, which employ context-free grammars to syntactically constrain the output of the decoder. In contrast, the hierarchical VAE (HVAE) proposed in this paper constrains the output of the decoder to binary expression trees. These trees are encoded and decoded with two simple extensions of gated recursive units. We conjecture that the HVAE can be trained more efficiently than sequential and grammar-based VAEs. Indeed, the experimental evaluation shows that the HVAE can be trained with less data and in a lower-dimensional latent space, while still significantly outperforming other approaches. The latter allows for efficient symbolic regression via Bayesian optimization in the latent space and the discovery of complex equations from data.
Modern web technologies enable interactive visualization of graphs in a web browser. However, larger graphs, which are common in investigation data, bring numerous technical limitations in terms of transfer bandwidth and application responsiveness. In this paper, we propose a machine-learning-based method to efficiently transfer graph data between the client and the server. Graphs are stored in the investigation platform's database on the server and are transferred to the web browser on the client for interactive visualization and manipulation. To this end, we utilize graph embedding. The main aim is a responsive web application, achieved through a less complex layout-calculation algorithm on the client, lower bandwidth requirements, and incremental visualization of nodes during graph transfer. We performed several experiments, which demonstrate faster graph visualization on the client. We observed up to 130-times faster layout calculation on the client compared to standard force-directed graph algorithms on graphs with up to 1000 nodes, while preserving adequate visualization accuracy.
Several papers have attempted to classify music genres based on features extracted from sound recordings. However, none have implemented an ensemble classifier of different CNNs for various types of spectrograms. One thousand sound recordings from the GTZAN database were used for classification by the authors. Each sound recording was converted into three different spectrogram types, resulting in 3000 spectrograms. 85% of the spectrograms were used to train three CNN models, and the remaining 15% were used for testing. The individual CNN models formed a classifier ensemble, which combined the predictions of the respective models into a single prediction based on the sum of the scores of the respective genres. Since the accuracy of the classifier ensemble (54.67%) is higher than the accuracy of the individual classification models (44.00%, 53.33%, 26.67%), it was beneficial to combine the CNN models into one ensemble. The confusion matrix revealed some common errors in genre prediction. The somewhat low accuracy is likely a consequence of the truncated sound recordings. Although the classifier ensemble did not achieve high accuracy, it predicted the genre from the spectrograms of a sound recording more accurately than a human. Weighting the individual CNN models could significantly improve the results.
In many cities around the world, large sums of money are invested in surveillance camera systems, but few cities optimize the benefits and costs of those investments, and thus the overall impact of surveillance cameras on crime rates. This paper is based on Real-ESRGAN, a practical restoration technique that enhances the efficient ESRGAN. It is a blind super-resolution method developed to restore low-resolution street images with unknown and complicated degradations, and it can be applied for security purposes in surveillance systems. Since video surveillance systems typically capture low-resolution images, the detection and identification of objects in them is often required. Super-resolution for this task is difficult because image appearances vary depending on many factors: low resolution combined with poor optics is completely insufficient for identifying a subject of interest on the street, from a distance, in bad weather, or under other limitations. Furthermore, to strengthen the discriminator and create stable training dynamics, a U-Net discriminator with spectral normalization was employed. Compared to other experimental techniques, this method delivers the best result. Experimental results show that super-resolution recovery of street images taken from a surveillance system is attainable, with a PSNR of 30.36 dB and an SSIM of 0.86.
Recently there has been a revolution in the field of text-conditioned image generation. Advances in neural network architectures as well as the availability of large open datasets have enabled those with access to powerful computing clusters to train models that can generate high-resolution images of virtually anything from text alone. Thanks to the significant effort of the open-source community, some of these models have recently become available to anyone. We will take a look at the recent history of these methods, how these models work, and what we can do with them.
Gathering useful information from user interactions on social media is a challenging task but has several important use cases. For example, law enforcement agencies monitor social media for threats to national security, marketers use them for launching marketing campaigns, etc. Since most social media platforms do not provide a standardized way of monitoring their data, most analyses are carried out manually. We aim to expedite this process by constructing social network graphs, where analysts can visually determine which users and content are important. In this paper, we compare two different approaches for constructing such graphs (path-weighted and degree-weighted). We analyze the time complexity of graph construction and discuss the usefulness of their visualization. In order to empirically evaluate both approaches, a method was developed which stochastically generates data adhering to rules that govern the generation of data on a social media platform. We found that constructing degree-weighted graphs is faster, although the visualization of a path-weighted graph can answer more questions about the dataset.
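As a hedged sketch of one plausible reading of a "degree-weighted" graph (the paper's exact definition may differ, and all names below are illustrative): build a graph from (source, target) interaction pairs and weight each user by the number of interactions they take part in, which makes construction a single linear pass over the interaction log.

```python
from collections import defaultdict

def build_degree_weighted_graph(interactions):
    """Return (edges, node_weight); node_weight counts incident interactions."""
    edges = defaultdict(int)    # (u, v) -> interaction count
    weight = defaultdict(int)   # user -> degree-based importance
    for u, v in interactions:
        edges[(u, v)] += 1
        weight[u] += 1
        weight[v] += 1
    return dict(edges), dict(weight)

# Toy interaction log: retweets/mentions between users.
log = [("ana", "bob"), ("ana", "bob"), ("cid", "bob"), ("bob", "ana")]
edges, weight = build_degree_weighted_graph(log)
```

A path-weighted variant would instead propagate weight along interaction chains, which requires traversals and is therefore slower, consistent with the comparison above.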
In this paper, we present sequential, parallel, and distributed implementations of the well-known k-means clustering algorithm. We perform extensive testing of all three implementations on state-of-the-art hardware and show the performance benefits of parallelization. The research was inspired by a use case of reverse-logistics optimization of wood in Germany, which translates to a facility location problem. K-means is a heuristic approach that renders surprisingly good results compared to mathematical modelling approaches, which are usually not feasible for large inputs as the underlying problems are NP-hard.
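For reference, the sequential baseline is Lloyd's iteration, a minimal pure-Python sketch of which follows; the parallel and distributed variants parallelize the same two steps (assignment and centroid update) over chunks of the data.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on 2D points: alternate assignment and update."""
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            d = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[d.index(min(d))].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl))
            if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
final = kmeans(pts, centroids=[(0, 0), (5, 5)])
```

On this toy input the two centroids converge to the means of the two well-separated groups.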
In this paper, we focus on the center closing problem, which is similar to the well-known 𝑘-center problem. Both problems are defined on a network with the goal of optimizing the worst-case service time for the clients, with the difference that in the center closing problem several existing centers are closed in order to optimize the total cost of operation. First, we show the NP-hardness of the problem. Afterwards, we describe several exact exponential algorithms for solving it. Finally, we experimentally evaluate these algorithms on two test scenarios.
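The simplest exact exponential algorithm is brute force over the centers to close. The sketch below assumes the simplest cost model, minimizing the worst-case client-to-nearest-open-center distance; the paper's exact cost model may differ, and the instance data is made up for illustration.

```python
from itertools import combinations

def close_centers(dist, clients, centers, q):
    """Try every way of closing q centers; return (best_closed_set, radius)."""
    best = (None, float("inf"))
    for closed in combinations(centers, q):
        remaining = [c for c in centers if c not in closed]
        # Worst-case distance from any client to its nearest open center.
        worst = max(min(dist[v][c] for c in remaining) for v in clients)
        if worst < best[1]:
            best = (set(closed), worst)
    return best

# Toy instance on 4 nodes; dist[u][v] is the shortest-path distance.
dist = {
    0: {0: 0, 1: 1, 2: 4, 3: 5},
    1: {0: 1, 1: 0, 2: 3, 3: 4},
    2: {0: 4, 1: 3, 2: 0, 3: 1},
    3: {0: 5, 1: 4, 2: 1, 3: 0},
}
closed, radius = close_centers(dist, clients=[0, 1, 2, 3], centers=[1, 2, 3], q=1)
```

With m centers this enumerates C(m, q) subsets, which is exponential in the worst case, matching the "exact exponential" framing above.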
In this paper we study a new combinatorial game played on Young diagrams, called Column-Row. We devise a dynamic-programming algorithm for computing winning positions or, more generally, Sprague-Grundy values. Using it, we identify winning strategies for several infinite families of starting positions. We prove those results formally and conclude with a conjecture arising from this work.
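The rules of Column-Row are not reproduced here, so as an illustration of the same dynamic-programming idea the sketch below computes Sprague-Grundy values for a simple subtraction game (remove 1, 2, or 3 tokens): the Grundy value of a position is the mex of the Grundy values of its successors, and a position is losing for the player to move exactly when its value is 0.

```python
from functools import lru_cache

def mex(values):
    """Minimum excludant: smallest non-negative integer not in `values`."""
    s = set(values)
    m = 0
    while m in s:
        m += 1
    return m

@lru_cache(maxsize=None)
def grundy(n):
    if n == 0:
        return 0  # no moves available: a losing position
    return mex(tuple(grundy(n - k) for k in (1, 2, 3) if k <= n))

values = [grundy(n) for n in range(8)]
```

For a game on Young diagrams, the position would be a partition rather than an integer, but the memoized mex recursion is the same.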
Millions of people around the world continue to express their views on various topics on Twitter every day. Such data is frequently used to generate and analyze networks of users, tweets, and hashtags based on specific actions, such as tweets, retweets, mentions, etc. In our study, we focus on tweets related to the Russo-Ukrainian conflict. We combine sentiment and network analysis approaches to produce various important insights into the discussion of the conflict. We focus on the most influential actors in the debate as well as on uncovering communities of users or hashtags that correspond to either side of the conflict. We discovered that the vast majority of users express support for Ukraine, and that the most important accounts belong to political leaders (e.g. Volodymyr Zelenskyy), relevant organizations (NATO), or media outlets that actively report on the conflict (Kremlin News). Similarly, most of the relevant hashtags are used predominantly in a pro-Ukraine context, while many of them appear in tweets supporting Russia as well (e.g. #war, #Russia). We identified numerous communities within the networks, which belong to discussions about the conflict held in various languages or about various aspects that the war indirectly affects (e.g. finance & cryptocurrencies). Apart from a few very evidently pro-Russia communities, all the groups express support for Ukraine to at least some degree. Future research should focus on more thoughtful data collection and, consequently, a more thorough analysis of various aspects of the networks.
Sentiment analysis, also called opinion mining, is a well-known natural language processing problem. This paper presents the use of the existing SloBERTa and XLM-RoBERTa models on the Slovenian news corpus SentiNews 1.0 and compares their performance. The results are further compared to the results achieved by the Multinomial Naive Bayes and Support Vector Machines methods used in the dataset paper. The trained models are also applied to data collected from the social media platform Reddit, in order to analyse the sentiment of posts and comments from the Slovenian community.
Diagnosing Parkinson's disease is a topical research problem faced by physicians. Patients' symptoms are often not clearly expressed, which increases the possibility of error in assessing the disease. By using machine learning methods and developing dedicated user-facing software, a physician can obtain additional information that reduces the chance of diagnostic errors. In this way, the patient can also be offered the option of performing the test at home. The purpose of this article is to present a web application and backend system for simple analysis of a finger-tapping recording performed by the patient.
The iterated prisoner's dilemma is a heavily studied concept in the field of game theory. It was popularized by Axelrod, and it can be used as a tool for modelling complex interactions between self-interested entities. The goal of this paper was to study the impact the environment has on the development of populations of prisoner's dilemma strategies in a simulation, where individuals interact with each other and play the iterated prisoner's dilemma game. This was done from an ecological perspective, meaning the behaviour of strategies stayed static and did not evolve between generations, in a simulation extending the iterated prisoner's dilemma game. Additionally, the paper presents two new strategies, which are evaluated with Axelrod's original tournament and with our simulation. The implemented simulation uses Axelrod's tournament as a fitness function and fitness-proportionate selection for choosing the next generation's strategies. Both of our strategies are based on the n-Pavlov strategy. They achieved average results, but neither improved on the original.
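A minimal sketch of one iterated prisoner's dilemma match with the standard Axelrod payoffs (T=5, R=3, P=1, S=0) follows; the strategies shown are the classic tit-for-tat and always-defect, not the paper's n-Pavlov variants, and a full tournament would simply run such matches over all strategy pairs.

```python
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(own_history, opp_history):
    """Cooperate first, then mirror the opponent's last move."""
    return opp_history[-1] if opp_history else "C"

def always_defect(own_history, opp_history):
    return "D"

def play_match(strat_a, strat_b, rounds=10):
    """Play `rounds` iterations and return the cumulative scores."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a = strat_a(hist_a, hist_b)
        move_b = strat_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

scores = play_match(tit_for_tat, always_defect, rounds=10)
```

In an ecological simulation as described above, the scores from all matches would feed a fitness-proportionate selection step that determines each strategy's share of the next generation.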
Jointly organized by the computer science faculties of the University of Ljubljana, the University of Maribor, and the University of Primorska, with the support of the Slovenian Chapter of the Association for Computing Machinery.
The conference will be held at the Faculty of Computer and Information Science of the University of Ljubljana.