Nowadays, temporal data is generated at an unprecedented speed from a variety of applications, such as wearable devices, sensor networks, and wireless networks. In contrast to such a large amount of temporal data, it is usually the case that only a small portion of it contains information of interest. For example, among the ECG signals collected by wearable devices, most of those collected from healthy people are normal, and only the small number collected from people with certain heart diseases are abnormal. Furthermore, even for the abnormal temporal sequences, the abnormal patterns may be present in only a few time segments and are similar among themselves, forming a rare category of temporal patterns. For example, the ECG signal collected from an individual with a certain heart disease may be normal in most time segments and abnormal in only a few time segments, which exhibit similar patterns. What is even more challenging is that such rare temporal patterns are often non-separable from the normal ones. Existing works on outlier detection for temporal data focus on detecting either the abnormal sequences as a whole or the abnormal time segments directly, ignoring the relationship between abnormal sequences and abnormal time segments. Moreover, the abnormal patterns are typically treated as isolated outliers instead of a rare category with self-similarity. In this paper, for the first time, we propose a bi-level (sequence-level/segment-level) model for rare temporal pattern detection. It is based on an optimization framework that fully exploits the bi-level structure in the data, i.e., the relationship between abnormal sequences and abnormal time segments. Furthermore, it uses sequence-specific simple hidden Markov models to obtain segment-level labels, and leverages the similarity among abnormal time segments to estimate the model parameters. 
To solve the optimization framework, we propose the unsupervised algorithm BIRAD, and also the semi-supervised version BIRAD-K, which learns from a single labeled example. Experimental results on both synthetic and real data sets demonstrate the performance of the proposed algorithms from multiple aspects: they outperform state-of-the-art techniques on both temporal outlier detection and rare category analysis.
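To make the bi-level idea concrete, the following is a minimal sketch of how a two-state hidden Markov model could assign segment-level labels within one sequence, with the sequence-level label then derived from them. It is not the BIRAD algorithm: it runs generic Viterbi decoding with fixed, hand-picked parameters, whereas the proposed method estimates model parameters through its optimization framework. The function name `viterbi_two_state`, the Gaussian emission model, the parameter values, and the one-dimensional segment features are all assumptions made for illustration.

```python
import math


def viterbi_two_state(obs, means, stds, stay_prob=0.9, init=(0.5, 0.5)):
    """Most likely state path (0 = normal, 1 = abnormal) for a two-state
    HMM with Gaussian emissions. All parameters here are hypothetical."""

    def log_gauss(x, mu, sd):
        return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mu) ** 2 / (2 * sd * sd)

    # Symmetric transition matrix: stay with stay_prob, switch otherwise.
    log_trans = [
        [math.log(stay_prob), math.log(1 - stay_prob)],
        [math.log(1 - stay_prob), math.log(stay_prob)],
    ]

    # Initialise with the first observation.
    score = [math.log(init[s]) + log_gauss(obs[0], means[s], stds[s]) for s in (0, 1)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1

    for x in obs[1:]:
        pointers, new_score = [], []
        for s in (0, 1):
            candidates = [score[p] + log_trans[p][s] for p in (0, 1)]
            best = 0 if candidates[0] >= candidates[1] else 1
            pointers.append(best)
            new_score.append(candidates[best] + log_gauss(x, means[s], stds[s]))
        back.append(pointers)
        score = new_score

    # Trace the best path backwards from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# One hypothetical feature value per time segment of a single sequence.
segment_features = [0.1, -0.2, 0.0, 3.1, 2.9, 3.3, 0.2, -0.1]
labels = viterbi_two_state(segment_features, means=(0.0, 3.0), stds=(1.0, 1.0))

# Bi-level link: a sequence is abnormal if any of its segments is abnormal.
sequence_is_abnormal = any(labels)
```

In this toy setup, the sequence-level label follows directly from the segment-level labels, which is the relationship between the two levels that the proposed model is built to exploit.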