
"Food for Thought" Graduate Colloquium Series

In conjunction with its regular Coffee Hour, the CS-GSO presents a periodic series of graduate colloquia, short 20- to 30-minute student-led presentations aimed at informing, engaging, and enriching the student research community in the Computer Science Department.

Spring 2014 (Term 2144)

3:00pm, SENSQ 6329
Active Learning in Imbalanced Datasets

Active learning is a subfield of machine learning whose goal is to achieve high accuracy using as few labeled instances as possible; it thus reduces the number of training examples needed while maintaining good performance. In this talk, we go over some basic machine learning concepts and some general active learning algorithms, especially as applied to imbalanced datasets. We then discuss some other possible ideas to pursue in active learning.
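As a concrete illustration of the kind of algorithm such a talk surveys, the sketch below shows pool-based uncertainty sampling, one common active learning strategy. The toy one-dimensional nearest-centroid classifier, the oracle, and all the numbers are illustrative assumptions, not necessarily the speaker's method.

```python
import random

def train_centroid(labeled):
    """Fit a toy 1-D nearest-centroid classifier: one mean per class."""
    means = {}
    for x, y in labeled:
        means.setdefault(y, []).append(x)
    return {y: sum(xs) / len(xs) for y, xs in means.items()}

def uncertainty(x, means):
    """Smaller margin between the two nearest class means = more uncertain."""
    d = sorted(abs(x - m) for m in means.values())
    return d[1] - d[0] if len(d) > 1 else float("inf")

def active_learn(pool, oracle, seed_labels, budget):
    """Repeatedly query a label for the most uncertain point in the pool."""
    labeled, pool = list(seed_labels), list(pool)
    for _ in range(budget):
        means = train_centroid(labeled)
        x = min(pool, key=lambda p: uncertainty(p, means))
        pool.remove(x)
        labeled.append((x, oracle(x)))  # ask the oracle (a human annotator)
    return train_centroid(labeled)

random.seed(0)
# Imbalanced pool: 95 points near 0 (majority class), only 5 near 5 (minority).
pool = ([random.gauss(0, 1) for _ in range(95)]
        + [random.gauss(5, 1) for _ in range(5)])
oracle = lambda x: 1 if x > 2.5 else 0
model = active_learn(pool, oracle, seed_labels=[(-0.5, 0), (5.2, 1)], budget=10)
print(model)  # two class means, learned from only 12 labels
```

Because only the points nearest the current decision boundary get labeled, a small labeling budget can still locate the boundary even when one class is rare.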

3:00pm, SENSQ 6329
Automated Operator Placement in Distributed Data Stream Management Systems Subject to User Constraints

Traditional distributed Data Stream Management Systems assign query operators to sites by optimizing for some criterion such as query throughput or network delay. The work presented in this paper begins to augment this traditional operator placement technique by allowing the user issuing a continuous query to specify a variety of constraints—including collocation, upstream/downstream, and tag- or attribute-based constraints—controlling operator placement within the query network. Given a set of constraints, operators, and sites, four strategies are presented for optimizing the operator placement. An optimal brute-force algorithm is presented first for smaller cases, followed by linear programming, constraint satisfaction, and local search strategies. The four methods are compared for speed, accuracy, and efficiency, with constraint satisfaction performing the best and allowing assignments to be adapted on the fly by the DDSMS.
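To make the problem concrete, the sketch below shows the brute-force strategy in miniature: enumerate every operator-to-site assignment, keep those satisfying the user constraints, and pick the one minimizing total network delay. The operators, sites, delays, and the collocation constraint are illustrative assumptions, not the paper's benchmark.

```python
from itertools import product

sites = ["A", "B", "C"]
operators = ["filter", "join", "aggregate"]
# Pairwise network delay between sites (ms); 0 on the same site.
delay = {("A", "B"): 5, ("B", "A"): 5, ("A", "C"): 9, ("C", "A"): 9,
         ("B", "C"): 2, ("C", "B"): 2,
         ("A", "A"): 0, ("B", "B"): 0, ("C", "C"): 0}
# Dataflow edges in the query network: filter -> join -> aggregate.
edges = [("filter", "join"), ("join", "aggregate")]
# Example user constraint: collocate the join and aggregate operators.
collocate = [("join", "aggregate")]

def cost(assign):
    """Total delay along the query network's dataflow edges."""
    return sum(delay[(assign[u], assign[v])] for u, v in edges)

def feasible(assign):
    """Check the collocation constraints."""
    return all(assign[a] == assign[b] for a, b in collocate)

best = min(
    (dict(zip(operators, p)) for p in product(sites, repeat=len(operators))
     if feasible(dict(zip(operators, p)))),
    key=cost,
)
print(best)
```

Exhaustive search like this is only viable for small cases (here, 3^3 = 27 assignments), which is why the paper moves on to linear programming, constraint satisfaction, and local search for larger networks.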

3:00pm, SENSQ 6329
Shadows on the Cloud: An energy-aware, profit maximizing resilience framework for cloud computing

As the demand for cloud computing continues to increase, cloud service providers face the daunting challenge of meeting negotiated SLAs, in terms of reliability and timely performance, while achieving cost-effectiveness. This challenge is compounded by the growing likelihood of failure in large-scale clouds and the rising cost of energy consumption. This paper proposes Shadow Replication, a novel profit-maximization resiliency model, which seamlessly addresses failure at scale while minimizing energy consumption. The basic tenet of the model is to associate a suite of shadow processes that execute concurrently with the main process, but initially at a much reduced execution speed, to overcome failures as they occur. Two computationally feasible schemes are proposed to achieve shadow replication. A performance evaluation framework is developed to analyze these schemes and compare their performance to traditional replication-based fault tolerance methods, focusing on the inherent tradeoff between fault tolerance, the specified SLA, and profit maximization. The results show that Shadow Replication leads to significant energy reduction and is better suited to compute-intensive execution models, where profit increases of up to 30% can be achieved.
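A back-of-the-envelope sketch can convey the basic tenet: the shadow runs slowly and cheaply unless the main fails, at which point it speeds up to still meet the deadline. The cubic power model, the fixed failure time, and all numbers below are illustrative assumptions, not the paper's actual cost model.

```python
def energy(speed, duration):
    # Dynamic power is assumed to scale with the cube of speed (illustrative).
    return speed**3 * duration

W, deadline, sigma = 100.0, 150.0, 0.5  # work units, SLA deadline, shadow speed
fail_time = 50.0                        # assume the main fails halfway through

# Traditional replication: two full-speed replicas run the whole task.
traditional = 2 * energy(1.0, W)

# Shadow replication, no failure: main at speed 1, shadow at sigma,
# both stopping when the main finishes at time W.
shadow_ok = energy(1.0, W) + energy(sigma, W)

# Shadow replication, main fails at fail_time: the shadow speeds up to
# finish its remaining work before the deadline.
done = sigma * fail_time
recovery_speed = (W - done) / (deadline - fail_time)
shadow_fail = (energy(1.0, fail_time)          # main, until it failed
               + energy(sigma, fail_time)      # shadow, slow phase
               + energy(recovery_speed, deadline - fail_time))  # fast phase

print(traditional, shadow_ok, shadow_fail)
```

Under these toy numbers the shadow scheme uses less energy than full replication in both the failure-free and the failure case, which is the intuition behind the profit results above.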

3:00pm, SENSQ 6329
Sparse Linear Dynamical System

The Linear Dynamical System (LDS) is an elegant mathematical framework for modeling and learning multivariate time series. However, in general, it is difficult to set the dimension of its hidden state space. A small number of hidden states may not be able to model the complexities of a time series, while a large number of hidden states can lead to overfitting. In this work, we study methods that impose an ℓ1 regularization on the transition matrix of an LDS model to alleviate the problem of choosing the optimal number of hidden states. We incorporate a generalized gradient descent method into the Maximum a Posteriori (MAP) framework and use Expectation Maximization (EM) to iteratively achieve sparsity on the transition matrix of an LDS model. We show that our Sparse Linear Dynamical System (SLDS) improves predictive performance compared to an ordinary LDS on several multivariate time series datasets.
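As an illustration of how an ℓ1 penalty induces sparsity, the sketch below performs one proximal (generalized) gradient step on a 2×2 transition matrix. The soft-thresholding operator is the standard prox of the ℓ1 norm; the matrix, gradient, and step sizes are made up for illustration and are not from the talk.

```python
def soft_threshold(x, t):
    """Prox of t*|x|: shrink x toward zero, zeroing entries with |x| <= t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def prox_step(A, grad, lr, lam):
    """One proximal gradient step: a gradient step on the data-fit term,
    followed by soft-thresholding for the l1 penalty."""
    return [[soft_threshold(a - lr * g, lr * lam)
             for a, g in zip(row, grow)]
            for row, grow in zip(A, grad)]

A = [[0.90, 0.02], [0.01, 0.80]]      # current transition matrix (toy values)
grad = [[0.10, 0.00], [0.00, -0.05]]  # gradient of the data-fit term (toy)
A_new = prox_step(A, grad, lr=0.5, lam=0.05)
print(A_new)  # the small off-diagonal entries are driven exactly to zero
```

Iterating such steps inside the EM M-step is one way the transition matrix can become genuinely sparse, rather than merely small, in its weak entries.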

3:00pm, SENSQ 6329
Fully Topology-Aware Dynamic Load Balancing for Scientific Computation in Parallel Architectures

Scientific computation demands massive computing power, which commonly can be delivered only by equalizing the load across cores on multicore or manycore architectures while minimizing the communication among cores incurred by data dependencies. However, the hierarchical topology of modern parallel architectures requires scientific applications to map their communication pattern to the underlying hardware topology, leaving a gap between the performance offered by modern parallel architectures and the performance achieved by the applications.

To bridge this performance gap, we propose a fully topology-aware (hyper)graph-based dynamic load balancer, FullTopoLB, which takes both the interconnect network and the intra-node hierarchical topology into account while rebalancing the load, for scientific computations running on parallel architectures with a 3D-torus interconnect. We also carried out a simulation-based performance study in which we fed FullTopoLB several synthetic datasets, generated from a real turbulent combustion simulation dataset with different configurations, on a simplified simulated supercomputer. Our experimental results show that FullTopoLB improves performance by up to 30% and up to 40%, respectively, compared with the dynamic load balancers provided by Zoltan and ParMETIS.
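To give a flavor of what "topology-aware" means here, the sketch below weights each communication edge by its hop distance on a wraparound 3D torus, the quantity a topology-aware mapper tries to keep small. The torus size, task graph, and placements are illustrative assumptions, not FullTopoLB's actual cost model.

```python
def torus_hops(a, b, dims):
    """Shortest hop distance between nodes a and b on a wraparound 3D torus."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))

def comm_cost(mapping, edges, dims):
    """Total volume-weighted hop distance for a task-to-node mapping."""
    return sum(vol * torus_hops(mapping[u], mapping[v], dims)
               for u, v, vol in edges)

dims = (4, 4, 4)  # a 4x4x4 torus
edges = [("t0", "t1", 10), ("t1", "t2", 3)]  # (task, task, message volume)

# Topology-aware placement keeps heavily communicating tasks adjacent;
# a topology-oblivious one scatters them across the machine.
near = {"t0": (0, 0, 0), "t1": (0, 0, 1), "t2": (0, 1, 1)}
far = {"t0": (0, 0, 0), "t1": (2, 2, 2), "t2": (0, 1, 1)}
print(comm_cost(near, edges, dims), comm_cost(far, edges, dims))
```

Keeping the heavy t0-t1 edge on neighboring torus nodes cuts the volume-weighted hop count by several times, which is the kind of saving a topology-aware load balancer exploits.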


© 2009-2024 Pitt CS & SCI GSO