Queens College CUNY Computer Science Colloquium

- Wednesday, 09/11/2024, 12:15PM - 1:30PM

Room: Science Building, C203

Speaker: Tim Mitchell, Department of Computer Science, Queens College of CUNY

**Title**: Computing the Kreiss Constant of a Matrix

**Abstract**: As famously introduced by H.-O. Kreiss about six decades ago, the so-called Kreiss constant of a square matrix bounds the magnitude of the transient behavior of the associated system of ordinary differential (or difference) equations. Although asymptotically stable systems must eventually decay to a stable state, they may nevertheless incur surprisingly large transients before settling down, and such behavior can have consequences ranging from undesirable to catastrophic. For example, large transients are inherently power inefficient and may pose comfort or safety concerns, e.g., in transportation applications. Moreover, if the underlying physical system is nonlinear, large transient behavior in a stable linear model of that system may end up feeding nonlinear instabilities, which in turn could destabilize the actual physical process!

Although Kreiss constants theoretically allow us to robustly predict how severe the transient behavior of a linear dynamical system will be, until recently no methods existed to actually compute them. Instead, Kreiss constants have historically been approximated only by crude and ad hoc techniques that may not provide even a single digit of accuracy. A sticking point for better methods has been that the value of a Kreiss constant is formulated via certain global continuous optimization problems that are often nonconvex with multiple local optimizers, and general nonconvex optimization is NP-hard. However, by taking advantage of special structure specific to our problem of interest, we present the very first algorithms to compute Kreiss constants to arbitrary accuracy. In fact, using two very different techniques, we give two different classes of algorithms to reliably compute Kreiss constants.
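For intuition, here is a minimal NumPy sketch (not from the talk; the function name and grid parameters are illustrative) of exactly the kind of crude, guarantee-free estimate the abstract contrasts with the new algorithms: sampling the continuous-time Kreiss constant K(A) = sup over Re(z) > 0 of Re(z)·‖(zI − A)⁻¹‖₂ on a finite grid in the right half-plane.

```python
import numpy as np

def kreiss_constant_grid(A, re_vals, im_vals):
    """Crude grid estimate of the continuous-time Kreiss constant
    K(A) = sup_{Re z > 0} Re(z) * ||(z I - A)^{-1}||_2.
    This is the kind of ad hoc approximation the talk improves on:
    it only samples finitely many points and carries no accuracy guarantee."""
    n = A.shape[0]
    I = np.eye(n)
    best = 0.0
    for x in re_vals:
        for y in im_vals:
            z = complex(x, y)
            # Re(z) times the spectral norm of the resolvent at z
            best = max(best, x * np.linalg.norm(np.linalg.inv(z * I - A), 2))
    return best
```

A normal stable matrix has K(A) = 1 (no transient growth), while a nonnormal stable matrix such as [[-1, 5], [0, -1]] has K(A) > 1, signaling transient amplification before decay.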

- Monday, 09/16/2024, 12:15PM - 1:30PM

Room: Science Building, C205

Speaker: Jun Li, Department of Computer Science, Queens College of CUNY

**Title**: Coding Techniques to Mitigate Stragglers in Large-Scale Distributed Computation

**Abstract**: Large-scale computation distributed across servers is often slowed down by stragglers: servers that delay or fail to complete tasks on time. In this talk, I will present two recent advancements in coding schemes to mitigate the impact of stragglers in distributed computing, focusing on matrix multiplication and gradient-based optimization. First, I will introduce a coding scheme for matrix multiplication that leverages subtask execution order to improve efficiency and recoverability. By probabilistically recovering incomplete tasks through coded ones, our method reduces encoding and decoding complexity while maintaining high recoverability. Next, I will present ignore-straggler gradient coding (IS-GC), which tolerates an arbitrary number of stragglers in distributed machine learning. Using a graph-based decoding model, IS-GC maximizes gradient recovery and reduces training time.
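To illustrate the basic idea behind gradient coding (this sketch shows a classic replication-based scheme in the style of earlier work, not IS-GC itself, which is more general; all names here are illustrative): each worker holds more than one data part and sends a fixed linear combination of its partial gradients, so the full gradient can be decoded even when some workers never respond.

```python
import numpy as np

# 3 workers, data split into 3 parts; each worker holds 2 parts and sends
# one coded combination, so any 1 straggler can be tolerated.
B = np.array([[1.0, 2.0, 0.0],    # worker 0 holds parts 0, 1
              [0.0, 1.0, -1.0],   # worker 1 holds parts 1, 2
              [0.5, 0.0, 1.0]])   # worker 2 holds parts 0, 2

def encode(G):
    """Each worker's message is a fixed linear combination of the
    partial gradients it holds (rows of G)."""
    return B @ G

def decode(msgs, rows):
    """Recover the full gradient (sum of all parts) from any 2 surviving
    workers: find coefficients a with a @ B[rows] == all-ones vector."""
    a, *_ = np.linalg.lstsq(B[rows].T, np.ones(B.shape[1]), rcond=None)
    return a @ msgs[rows]
```

The design requirement is that the all-ones vector lies in the span of every size-2 subset of B's rows; the matrix above satisfies this, so any single straggler's message can be ignored.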

- Monday, 10/07/2024, 12:15PM - 1:30PM

Room: Science Building, C201

Speaker: Teal Witter, NYU

**Title**: Estimating Shapley Values via Leverage Score Sampling

**Abstract**: Explaining machine learning predictions is crucial to deploying AI in high-stakes domains. Inspired by game theory, Shapley values are a widely used method for attributing model predictions to input features. Due to the exponential time (in the number of features) required to compute Shapley values exactly, popular algorithms like Kernel SHAP are used to approximate Shapley values. While Kernel SHAP is agnostic to the machine learning model and is remarkably effective in practice, there are no known guarantees on its performance. We propose Leverage SHAP: a theoretically motivated modification to Kernel SHAP that exploits leverage score sampling. In experiments, we find that Leverage SHAP can even outperform the highly optimized official implementation of Kernel SHAP. Further, we exploit the elegant connection between Shapley values and linear regression to show that Leverage SHAP is provably accurate: with high probability, Shapley values can be effectively approximated by Leverage SHAP in almost linear time.
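For context on why sampling methods like Kernel SHAP and Leverage SHAP are needed, here is a sketch (illustrative names, not from the talk) of the exact Shapley value computation the abstract refers to: averaging each feature's marginal contribution over all subsets, which takes time exponential in the number of players n.

```python
import itertools
import math

def shapley_values(n, v):
    """Exact Shapley values for an n-player game with value function v
    (mapping a set of players to a number). Enumerates all 2^(n-1)
    subsets per player, so it is only feasible for small n -- the
    motivation for approximation algorithms like Kernel SHAP."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(len(others) + 1):
            for S in itertools.combinations(others, k):
                # Shapley weight |S|! (n - |S| - 1)! / n!
                w = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
                phi[i] += w * (v(set(S) | {i}) - v(set(S)))
    return phi
```

For an additive game, each player's Shapley value is exactly its own contribution, and the values always sum to v(all players) minus v(empty set) (the efficiency property).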

- Monday, 10/21/2024, 12:15PM - 1:30PM

Room: Science Building, C201

Speaker: Benjamin Eysenbach, Princeton University

**Title**: Self-Supervised Reinforcement Learning: Algorithms and Emergent Properties

**Abstract**: In this talk, I will discuss recent work on self-supervised reinforcement learning, focusing on how we can learn complex behaviors without the need for hand-crafted rewards or demonstrations. I will introduce contrastive RL, a recent line of work that can extract goal-reaching skills from unlabeled interactions. This method will serve to highlight how there is much more to self-supervised RL than simply adding an LLM or VLM to an RL algorithm; rather, self-supervised RL can be seen as a form of generative AI itself. I will also share some recent work on blazing-fast simulators and new benchmarks, which have accelerated research in my group. Finally, I'll discuss emergent properties in self-supervised RL: preliminary evidence that we have found, and hints for where to go searching for more.

- Monday, 11/04/2024, 12:15PM - 1:30PM

Room: Science Building, C201

Speaker: Jiadong Lou, University of Delaware

**Title**: Data Privacy Threats on Advanced Machine Learning Models

**Abstract**: As machine learning becomes increasingly embedded in society, concerns regarding data privacy and security have grown. Models often rely on sensitive information to achieve high performance, making them attractive targets for adversaries seeking to extract private data. Users are also concerned about privacy leakage, especially when their data might be used without consent. This seminar focuses on data privacy threats on two prominent model classes: semi-supervised learning models and graph neural networks (GNNs).

Semi-supervised learning, which leverages limited labeled data and large volumes of unlabeled data, has shown promising results but also raises privacy concerns about unauthorized data use. To address this issue, we propose a novel membership inference method that helps users determine if their data has been used in training. By leveraging metrics like inter-consistency and intra-entropy tailored for advanced learning algorithms, our method is highly accurate in identifying training data, addressing a critical privacy risk in semi-supervised learning.

Graph neural networks are powerful tools for analyzing graph-structured data, such as social networks, but they are also vulnerable to link inference attacks, which exploit the relational nature of GNNs to infer sensitive connections between nodes. To counter this, we introduce the Graph Link Disguise (GRID) solution, an optimization-based defense mechanism that adds crafted noise to node prediction vectors. This approach effectively obscures private link information while maintaining model performance, achieving a superior balance between privacy and utility compared to existing methods. Our studies highlight significant privacy challenges in advanced machine learning models and demonstrate novel methods for mitigating data inference threats.
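As background for the membership inference discussion, here is a sketch of the standard entropy-threshold baseline that such attacks build on (this is a generic textbook baseline, not the talk's inter-consistency/intra-entropy method; function names and the threshold are illustrative): models tend to make more confident, lower-entropy predictions on their own training data, so low prediction entropy hints at membership.

```python
import numpy as np

def prediction_entropy(p):
    """Shannon entropy of a softmax output vector p.
    Training-set members typically receive more confident
    (lower-entropy) predictions than non-members."""
    p = np.clip(np.asarray(p, dtype=float), 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())

def is_member(p, threshold):
    """Generic entropy-threshold membership test: flag a sample as a
    likely training-set member if the model's prediction entropy on it
    falls below the threshold."""
    return prediction_entropy(p) < threshold
```

A confident prediction such as (0.98, 0.01, 0.01) has entropy near 0, while a uniform prediction over 3 classes has entropy ln 3 ≈ 1.10, so a threshold between them separates the two cases.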

- Monday, 11/18/2024, 12:15PM - 1:30PM

Room: Science Building, C201

Speaker: Shiqiang Wang, IBM T. J. Watson Research Center

**Title**:

**Abstract**:

- Wednesday, 12/04/2024, 12:15PM - 1:30PM

Room: Science Building, C201

Speaker: Jiaxin Guan, NYU

**Title**:

**Abstract**:

The seminar is organized by Jun Li.

Email Contact: jun.li@qc.cuny.edu