AI Robustness Evaluation Techniques and Case Studies
in Various Domains
Abstract:
AI has become an essential technology in safety-critical fields such as disaster response, defense, and law enforcement. However, the question "Is this AI truly robust?" still lacks a well-defined and reliable evaluation standard. A simplistic approach that merely tests AI in various environments does not constitute engineering-based validation. The process of evaluating AI robustness must itself be reliable: if evaluation results vary with the examiner's subjective judgment, evaluation remains a matter of individual talent rather than technology.
This presentation introduces technical methodologies that enable objective and consistent AI robustness evaluation. We have established seven technical standards in Korea and accumulated real-world application cases through numerous pilot projects, including public data diagnostics. We are also researching evaluation methodologies for Large Language Models (LLMs) and will share insights from this ongoing work. Moving beyond superficial discussions of AI Trustworthiness, this talk presents concrete technological approaches to ensuring the reliability of AI testing and evaluation itself.
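To make the reproducibility requirement concrete, here is a minimal, hypothetical sketch of what an examiner-independent robustness check can look like: the perturbation protocol is fully specified (noise distribution, magnitude, trial count) and seeded, so two examiners running it on the same model obtain identical numbers. The function names and the toy model below are illustrative assumptions, not the methodology presented in the talk.

```python
import numpy as np

def robustness_score(predict, inputs, labels, noise_std=0.1, n_trials=5, seed=0):
    """Mean accuracy under fixed-seed Gaussian input perturbations.

    Fixing the seed and the protocol parameters removes examiner
    subjectivity: the evaluation is reproducible by construction.
    """
    rng = np.random.default_rng(seed)  # deterministic noise source
    accs = []
    for _ in range(n_trials):
        noisy = inputs + rng.normal(0.0, noise_std, size=inputs.shape)
        accs.append(float(np.mean(predict(noisy) == labels)))
    return float(np.mean(accs))

# Toy stand-in model: classify by the sign of the feature sum.
predict = lambda x: (x.sum(axis=1) > 0).astype(int)
x = np.array([[1.0, 1.0], [-1.0, -1.0], [2.0, 0.5], [-0.5, -2.0]])
y = np.array([1, 0, 1, 0])

clean_acc = float(np.mean(predict(x) == y))
score = robustness_score(predict, x, y, noise_std=0.1)
```

The point of the sketch is not the metric itself but the contract: every free parameter of the test is pinned down in advance, so the result is a property of the model, not of the person running the test.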
Outline:
1. Defining AI Robustness Evaluation as an Engineering Challenge
· Why a reproducible and objective evaluation method is necessary instead of simple environmental testing
· The core issue of result variability due to the examiner’s experience or subjectivity
2. Technical Approaches to AI Robustness Evaluation
· AI robustness testing and validation techniques applicable across multiple domains
3. AI Robustness Evaluation Technologies in Korea
· Technological advancements developed through previous research
4. Real-World Pilot Projects and Public Data Diagnostic Cases
· Practical applications and outcomes of AI evaluation technologies in public and industrial sectors
· Key requirements for enhancing the reliability of AI evaluation
5. Global Expansion and Collaborative Opportunities
· Potential for global standardization of AI robustness evaluation technologies
· Proposed international collaboration models to strengthen AI Trustworthiness
6. Considerations and Exploration of LLM-Based AI Evaluation
· Current research directions for LLM-based AI evaluation techniques
· Potential integration with existing AI evaluation frameworks
Audience Takeaways:
1. Master an Objective & Technical Approach to AI Robustness Evaluation
· Go beyond simplistic AI safety discussions and learn verifiable evaluation techniques.
2. Explore Practical Applications Through Technical Standards and Pilot Cases
· Move beyond conceptual discussions and discover real-world applications of proven technologies.
3. Identify Opportunities for Global Standardization and Collaboration
· Discuss how to ensure the credibility of AI robustness evaluation and explore global cooperation opportunities.
Speaker Bio:
1. Developed AI Trustworthiness Verification Techniques and established seven group standards with Korea’s Telecommunications Technology Association (TTA)
2. Extensive experience in AI system development and commercialization; published research on AI Trustworthiness at international conferences
3. Certified in Functional Safety Verification Frameworks
· AFSP (Automotive Functional Safety Professional)
· CACSP (Certified Automotive Cyber Security Professional)
4. Lead author of the AI Trustworthiness Development Guide, published by the Korean Ministry of Science and ICT, covering all domains: Smart Policing, Hiring, Generative AI, Autonomous Driving, Healthcare, and Public & Social Services
5. Master’s Degree in Electronic Engineering from Jeonbuk National University