The ML-AIM Group

Reinforcement Learning

O. Atan, W. R. Zame, M. van der Schaar, "Sequential Patient Recruitment and Allocation for Adaptive Clinical Trials," International Conference on Artificial Intelligence and Statistics (AISTATS), 2019. [Link]

O. Atan, W. R. Zame, M. van der Schaar, "Counterfactual Policy Optimization Using Domain-Adversarial Neural Networks," ICML 2018 Causal Machine Learning Workshop, 2018. [Link]

O. Atan, J. Jordon, M. van der Schaar, "Deep-Treat: Learning Optimal Personalized Treatments from Observational Data using Neural Networks," AAAI, 2018. [Link]

O. Atan, C. Tekin, M. van der Schaar, "Global Bandits," IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2018.

C. Shen, C. Tekin, M. van der Schaar, "Generalized Global Bandit and Its Application in Cellular Coverage Optimization," IEEE Journal of Selected Topics in Signal Processing, 2018.

O. Atan, W. R. Zame, Q. Feng, M. van der Schaar, "Constructing Effective Personalized Policies Using Counterfactual Inference from Biased Data Sets with Many Features," Submitted, 2017. [Link]

R. Hellman, C. Tekin, M. van der Schaar, V. Santos, "Functional Contour-following via Haptic Perception and Reinforcement Learning," IEEE Transactions on Haptics, 2017.

K. Kanoun, C. Tekin, D. Atienza, and M. van der Schaar, "Big-Data Streaming Applications Scheduling Based on Staged Multi-armed Bandits," IEEE Transactions on Computers, 2016. [Link] [Supplementary material]

S. Amuru, C. Tekin, M. van der Schaar and M. Buehrer, "Jamming Bandits - A Novel Learning Method for Optimal Jamming," IEEE Transactions on Wireless Communications, vol. 15, no. 4, pp. 2792-2808, Apr. 2016. [Link]

O. Atan, C. Tekin, J. Xu and M. van der Schaar, "Discovering Action-Dependent Relevance: Learning from Logged Data," Submitted, 2015. [Link]

C. Tekin, O. Atan and M. van der Schaar, "Discover the Expert: Context-Adaptive Expert Selection for Medical Diagnosis," IEEE Transactions on Emerging Topics in Computing, vol. 3, no. 2, pp. 220-234, 2015. [Link]

C. Tekin and M. van der Schaar, "Active Learning in Context-Driven Stream Mining with an Application to Image Mining," IEEE Trans. Image Process., vol. 24, no. 11, pp. 3666-3679, 2015. [Link]

O. Atan and M. van der Schaar, "Discover Relevant Sources: A Multi-Armed Bandit Approach," Submitted, 2015. [Link]

M. Wolf, M. van der Schaar, H. Kim and J. Xu, "Analysis and Decision-Making in Caring Environments for Adults with Special Needs," IEEE Design & Test, Special Issue on Cyber-Physical Systems for Medical Applications, vol. 32, no. 5, Oct. 2015. [Link]

C. Tekin and M. van der Schaar, "Distributed Online Learning via Cooperative Contextual Bandits," IEEE Trans. Signal Process., vol. 63, no. 14, pp. 3700-3714, 2015. [Link]

O. Atan, C. Tekin, M. van der Schaar, "Global Multi-armed Bandits with Hölder Continuity," AISTATS, 2015. [Link]

V. Di Valerio, C. Petrioli, L. Pescosolido, M. van der Schaar, "A Reinforcement Learning-based Data-Link Protocol for Underwater Acoustic Communications," ACM International Conference on Underwater Networks & Systems 2015 (WUWNet '15). [Link]

B.-G. Kim, Y. Zhang, M. van der Schaar, and J.-W. Lee, "Dynamic Pricing and Energy Consumption Scheduling with Reinforcement Learning," IEEE Transactions on Smart Grid, 2015. [Link]

O. Atan, Y. Andreopoulos, C. Tekin, and M. van der Schaar, "Bandit Framework for Systematic Learning in Wireless Video-Based Face Recognition," IEEE J. Sel. Topics Signal Process., vol. 9, no. 1, June 2014. [Link]

L. Song, C. Tekin, and M. van der Schaar, "Clustering Based Online Learning in Recommender Systems: A Bandit Approach," ICASSP 2014. [Link]

O. Atan, Y. Andreopoulos, C. Tekin, and M. van der Schaar, "Bandit Framework for Systematic Learning in Wireless Video-Based Face Recognition," ICASSP 2014. [Link]

B. Kim, Y. Zhang, M. van der Schaar, and J. Lee, "Dynamic Pricing for Smart Grid with Reinforcement Learning," 2014 IEEE INFOCOM Workshop on Communications and Control for Smart Energy Systems. [Link]

X. Zhu, C. Lan and M. van der Schaar, "Low-complexity reinforcement learning for delay-sensitive compression in networked video stream mining," in Proc. IEEE ICME, San Jose, USA, July 2013. [Link]

N. Mastronarde, K. Kanoun, D. Atienza, and M. van der Schaar, "Markov Decision Process Based Energy-efficient Scheduling for Slice-parallel Video Decoding," in Proc. ICME 2013, San Jose, USA, July 2013. [Link]

N. Mastronarde, K. Kanoun, D. Atienza, P. Frossard, and M. van der Schaar, "Markov Decision Process Based Energy-Efficient On-Line Scheduling for Slice-Parallel Video Decoding on Multicore Systems," IEEE Trans. on Multimedia, vol. 15, no. 2, pp. 268-278, Feb. 2013. [Link]

N. Mastronarde and M. van der Schaar, "Reinforcement learning for power management in wireless multimedia communications," IEEE International Conference on Multimedia & Expo (ICME), July 11-15, 2011. [Link] (Also featured in the IEEE COMSOC MMTC R-Letter, Dec. 2011. [Link])

N. Mastronarde and M. van der Schaar, "Fast reinforcement learning for energy-efficient wireless communication," IEEE Trans. on Signal Processing, vol. 59, no. 12, pp. 6262-6266, Dec. 2011. [Link]

N. Mastronarde and M. van der Schaar, "Reinforcement learning for energy-efficient wireless transmission," ICASSP 2011. [Link]

R. Izhak-Ratzin, H. Park, and M. van der Schaar, "Reinforcement Learning in BitTorrent Systems," INFOCOM 2011 (mini conference). [Link]

N. Mastronarde and M. van der Schaar, "Online Reinforcement Learning for Dynamic Multimedia Systems," IEEE Trans. on Image Processing, vol. 19, no. 2, pp. 290-305, Feb. 2010. [Link]

N. Mastronarde and M. van der Schaar, "Online reinforcement learning for multimedia buffer control," ICASSP 2010. [Link]

U. Berthold, F. Fu, M. van der Schaar, and F. K. Jondral, "Detection of Spectral Resources in Cognitive Radios Using Reinforcement Learning," in Proc. IEEE DySPAN 2008. [Link]