This paper describes a clinical decision support framework for multi-step health care domains that dynamically recommends optimal treatment plans with respect to both patient outcomes and expected treatment cost. Our system uses a modified POMDP framework in which hidden states are not explicitly modeled; instead, probabilistic models for predicting future observables given observation and action histories are learned directly from electronic health record (EHR) data. High-quality treatment recommendations are found using a sampling-based tree-growing approach that produces good results despite exploring only a fraction of the observation and action spaces. We describe the application of the approach to an ischemic stroke domain with clinical trial data (International Stroke Trial Dataset, 1993-1996). The dataset is of moderate size (N = 19,435) and exhibits many characteristics of real EHR data, including noise, missing values, and idiosyncratic coding. The system's predictive model was chosen using cross-validated model selection from a set of candidate learning methods, including logistic regression, Naïve Bayes, Bayes nets, and random forests. Simulations suggest that the optimized decisions improve patient outcomes, such as 6-month survival rate, compared to the decisions of human doctors during the study.
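To make the planning idea concrete, the following is a minimal sketch of sampling-based lookahead over a history-conditioned predictive model. It is an illustration under stated assumptions, not the system described in this paper: the two actions, their costs, the probabilities inside `predict`, the `COST_WEIGHT` trade-off, and the function names are all hypothetical stand-ins for a model that would be learned from EHR data.

```python
import random

# Hypothetical toy domain: all names, probabilities, and costs are illustrative.
ACTIONS = {"treat": 1.0, "wait": 0.0}   # action -> treatment cost
COST_WEIGHT = 0.1                       # assumed trade-off between outcome and cost

def predict(history, action):
    """Stand-in for a learned model P(next observation | history, action)."""
    p_good = 0.7 if action == "treat" else 0.4
    # Let the most recent observation shift the prediction slightly.
    if history and history[-1][1] == "good":
        p_good = min(1.0, p_good + 0.1)
    return {"good": p_good, "poor": 1.0 - p_good}

def reward(observation, action):
    """Outcome utility minus weighted treatment cost."""
    return (1.0 if observation == "good" else 0.0) - COST_WEIGHT * ACTIONS[action]

def plan(history, depth, n_samples, rng):
    """Return (best_action, estimated_value) by sampling a lookahead tree."""
    if depth == 0:
        return None, 0.0
    best_action, best_value = None, float("-inf")
    for action in ACTIONS:
        dist = predict(history, action)
        outcomes, weights = zip(*dist.items())
        total = 0.0
        # Sample a few next observations instead of enumerating all of them.
        for obs in rng.choices(outcomes, weights=weights, k=n_samples):
            _, future = plan(history + ((action, obs),), depth - 1, n_samples, rng)
            total += reward(obs, action) + future
        value = total / n_samples
        if value > best_value:
            best_action, best_value = action, value
    return best_action, best_value

if __name__ == "__main__":
    rng = random.Random(0)
    action, value = plan((), depth=3, n_samples=20, rng=rng)
    print(action, round(value, 3))
```

The history is a tuple of (action, observation) pairs, so no hidden state is ever represented; the planner only queries the predictive model. Sampling `n_samples` observations per action keeps the tree small relative to full enumeration, at the price of a noisier value estimate.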