Highlights from the NeurIPS applied research conference
Capital One's AI research team recaps the NeurIPS conference, including work on neural networks and explainable AI.
In 2024, AI continues to dominate the global innovation conversation, including the future of what is possible when the technology is responsibly and thoughtfully applied across industry and society. Financial services is among the high-impact sectors poised to benefit most from advances in AI applications. Capital One recognizes that solving the toughest challenges within AI and finance requires an open exchange of ideas amongst industry, academia and research organizations.
Each year, Capital One proudly supports NeurIPS, the world’s foremost AI research conference that combines workshops, symposia, poster presentations and tutorials fostering the development and exchange of the newest advances in AI. Capital One is also an active participant in and contributor to NeurIPS and its community; our Applied Research team and our academic partners publish research and convene their peers to meaningfully advance the state of the art in AI in finance.
Here’s a recap of the highlights from Capital One’s NeurIPS 2023 engagement and perspectives.
Capital One NeurIPS research publication highlights
In 2023, our Applied Research team had several works accepted at NeurIPS. Here’s a summary of each:
Real-world datasets and class imbalance
The challenges arising from highly class-imbalanced datasets are pervasive in the real world. To date, most of the deep learning research in this field focuses on crafting specialized objectives and sampling strategies. In this work, we instead tune common hyper-parameters to obtain state-of-the-art performance on highly imbalanced classification problems. Additionally, we characterize common failure modes and share guidance for mitigating the adverse effects of class imbalance on deep learning tasks.
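A widely documented pitfall with imbalanced data is that plain accuracy can look excellent while the minority class is ignored entirely. The sketch below is illustrative only and is not taken from the paper: the 990-to-10 class split and the helper names are assumptions chosen to make the effect visible.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical dataset: 990 negatives, 10 positives (1% minority class).
y_true = [0] * 990 + [1] * 10
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 1000

acc = accuracy(y_true, y_pred)
bal = balanced_accuracy(y_true, y_pred)
print(f"accuracy={acc:.2f}, balanced accuracy={bal:.2f}")
```

Here the always-negative classifier scores 99% accuracy but only 50% balanced accuracy, which is why evaluation on imbalanced tasks usually relies on class-aware metrics.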
A performance-driven benchmark for feature selection in tabular deep learning
The majority of academic tabular benchmarks are inadequate representations of real-world feature size and complexity, which often require engineering and/or selecting features to lift performance on downstream modeling tasks. This paper presents a new tabular benchmark specifically created to evaluate the effectiveness of feature selection for tabular deep learning. We combine real datasets with synthesized extraneous features to build the benchmark. We also develop a LASSO-inspired, gradient-based feature selection method and demonstrate its effectiveness on datasets that contain random, corrupted and second-order features.
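To give a feel for the idea behind LASSO-style selection, here is a minimal sketch. It is not the method from the paper: it swaps in classic LASSO on a linear model, solved with proximal gradient descent (ISTA), and every name, the synthetic data, and the settings (lam, lr, steps) are assumptions made for illustration. Two features carry signal and four are random extraneous features.

```python
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_ista(X, y, lam=0.2, lr=0.1, steps=500):
    """Minimize mean squared error / 2 + lam * L1 norm of the weights."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        # Residuals of the current linear fit.
        residual = [sum(wj * xj for wj, xj in zip(w, row)) - yi
                    for row, yi in zip(X, y)]
        # Gradient of the smooth (squared error) part.
        grad = [sum(r * row[j] for r, row in zip(residual, X)) / n
                for j in range(d)]
        # Gradient step followed by L1 shrinkage.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad)]
    return w


random.seed(0)
n_samples, n_features = 200, 6
X = [[random.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
# Only features 0 and 1 influence the target; features 2-5 are extraneous.
y = [3.0 * row[0] - 2.0 * row[1] + random.gauss(0, 0.1) for row in X]

weights = lasso_ista(X, y)
selected = [j for j, wj in enumerate(weights) if abs(wj) > 1e-8]
print("selected features:", selected)
```

The L1 penalty drives the weights of uninformative features to exactly zero, so the surviving nonzero weights act as the selected feature set. A deep tabular model would need a different formulation, which this sketch does not attempt.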
The disagreement problem in faithfulness metrics
A core goal in explainable artificial intelligence (XAI) is to develop model-agnostic methods for post-hoc feature attribution. In recent years, there has been a growing number of such methods, but the field lacks a principled framework for selecting the most faithful method for a given use case. In this work, we shed light on the disagreement problem across XAI methods by measuring the faithfulness of local explanations for tabular data classifiers.
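As a rough illustration of what measuring faithfulness can look like, the sketch below scores two local explanations with one simple perturbation-based check: the correlation between each feature's attribution and the change in model output when that feature is reset to a baseline of zero. The specific metrics used in the paper are not described here, so the metric, the toy linear model, and both attribution vectors are assumptions.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def faithfulness(model, x, attribution, baseline=0.0):
    """Correlate attributions with the output drop from removing each feature."""
    full = model(x)
    drops = []
    for i in range(len(x)):
        perturbed = list(x)
        perturbed[i] = baseline
        drops.append(full - model(perturbed))
    return pearson(attribution, drops)


# Hypothetical linear model and input.
WEIGHTS = [2.0, -1.0, 0.5, 0.0]


def model(x):
    return sum(w * xi for w, xi in zip(WEIGHTS, x))


x = [1.0, 3.0, -2.0, 4.0]

# Explanation A: weight times input (exact for a linear model).
attr_a = [w * xi for w, xi in zip(WEIGHTS, x)]
# Explanation B: weights only, ignoring the input values.
attr_b = list(WEIGHTS)

score_a = faithfulness(model, x, attr_a)
score_b = faithfulness(model, x, attr_b)
print(f"A: {score_a:.3f}  B: {score_b:.3f}")
```

Under this particular check, explanation A scores higher than explanation B. A different faithfulness metric could rank the same explanations differently, which is the kind of disagreement the work examines.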
Capital One hosts dinner and networking events at NeurIPS
Capital One’s on-the-ground NeurIPS presence is multifaceted. Attendees can stop by the Capital One expo room booth to learn about what it’s like to work with us, areas of research we’re focusing on, industry challenges we’re solving and academic and research programs underway. We also host recruiting and networking happy hour events for people to get to know our work, our team and our interests better on their own terms.
Each year at NeurIPS, leaders from the Capital One Applied Research team also host a private salon-style dinner with select academic research partners and leading industry peers to discuss the state of the art in AI, including AI and generative AI trends, challenges and new advances across both industry and academia. This year’s dinner covered topics ranging from fine-tuning and pre-training to transformers, model accuracy, data security, and cost and performance, as well as the latest research advances. We always find incredible value in these dinners as participants come away with new perspectives, best practices and learnings, insights into research and methodology, and new connections with practical, actionable takeaways.
Papers we liked
Across keynote talks at this year’s conference, we noticed a consistent emphasis on growing trends in adaptable and composable architectures for multi-modal learning. In addition to exploring these frontiers, we continued to hone in on our core research themes, including tabular representation learning, synthetic data and data quality.
Given the abundance of tabular datasets at Capital One, we are intrigued by the latest insights provided by “When do neural nets outperform boosted trees on tabular data?” We’re also excited by the workshop paper “MultiTabQA,” which represents a step towards natural language reasoning over enterprise-quality data. Our curiosity has also been piqued by the high efficiency and foundational capabilities of Tabular Prior Fitted Networks, including the latest generative variant, TabPFGen.
Synthetic data plays a crucial role in evaluating and enhancing generalization, robustness and explainability. Benchmark libraries such as “Reimagining synthetic tabular data generation through data-centric AI: A comprehensive benchmark” are examples of valuable resources to accelerate the maturity and utility of synthetic data tools. In addition, we are inspired by the collaborative spirit fostered by industry-specific datasets like “Realistic synthetic financial transactions for anti-money laundering models.”
Data quality is a core principle in developing Capital One AI/ML systems. Novel methods like “TRIAGE: Characterizing and auditing training data for improved regression,” “Data selection for language models via importance resampling” and “An efficient dataset condensation plugin and its application to continual learning” show potential to enhance the fidelity and efficiency of our modeling pipelines and pave the way for deeper fundamental insights into our data.
Pioneering AI advancements for real-time intelligent experiences in finance
As we envision the next evolution of the financial services industry, we are inspired by adaptable techniques like Aging with GRACE and composable, multi-modal architectures including MultiMoDN and Event Stream GPT to help lay the foundation for best-in-class, real-time intelligent experiences for Capital One’s customers.
We look forward to sharing more about our ongoing participation and engagement with the AI research community this year. Our Applied Research team is growing! Interested in joining a world-class team that is accelerating state-of-the-art AI research into finance to change banking for good?