Graphs are used in a wide variety of applicationdomains, from social science to machine learning. Graphalgorithms present large numbers of irregular accesses with littledata reuse to amortize the high cost of memory accesses, requiring high memory bandwidth. Processing in memory (PIM) implemented through 3D die-stacking can deliver this highmemory bandwidth. In a system with multiple memory moduleswith PIM, the in-memory compute logic has low latency and highbandwidth access to its local memory, while accesses to remotememory introduce high latency and energy consumption. Ideally, in such a system, computation and data are partitioned amongthe PIM devices to maximize data locality. But the irregularmemory access patterns present in graph applications make itdifficult to guarantee that the computation in each PIM devicewill only access its local data. A large number of remote memoryaccesses can negate the benefits of using PIM. In this paper, we examine the feasibility and potential of finegrainedwork migration to reduce remote data accesses insystems with multiple PIM devices. First, we propose a datadrivenimplementation of our study algorithms: breadth-firstsearch (BFS), single source shortest path (SSSP) and betweennesscentrality (BC) where each PIM has a queue where the verticesthat it needs to process are held. New vertices that need to beprocessed are enqueued at the PIM device co-located with thememory that stores those vertices. Second, we propose hardwaresupport that takes advantage of PIM to implement highlyefficient queues that improve the performance of the queuingframework by up to 16.7%. Third, we develop a timing model forthe queueing framework to explore the benefits of workmigration vs. remote memory accesses. And, finally, our analysisusing the above framework shows that naïve task migration canlead to performance degradations and identifies trade-offsamong data locality, redundant computation, and load balanceamong PIM devices that must be taken into account to realize thepotential benefits of fine-grain task migration.