A 5-Point Minimal Solver for Event Camera Relative Motion Estimation

Event-based cameras are ideal for line-based motion estimation, since they predominantly respond to edges in the scene. However, accurately determining the camera displacement based on events remains an open problem. This is because line feature extraction and dynamics estimation are tightly coupled when using event cameras, and no precise model is currently available for describing the complex structures generated by lines in the space-time volume of events. We solve this problem by deriving the correct non-linear parametrization of such manifolds, which we term eventails, and demonstrate its application to event-based linear motion estimation with known rotation from an Inertial Measurement Unit. Using this parametrization, we introduce a novel minimal 5-point solver that jointly estimates line parameters and linear camera velocity projections, which can be fused into a single, averaged linear velocity when considering multiple lines. We demonstrate on both synthetic and real data that our solver generates more stable relative motion estimates than other methods, while capturing more inliers than clustering based on spatio-temporal planes. In particular, our method consistently achieves a 100% success rate in estimating linear velocity where existing closed-form solvers only achieve between 23% and 70%. The proposed eventails contribute to a better understanding of spatio-temporal event-generated geometries, and we believe they will become a core building block of future event-based motion estimation algorithms.


Introduction
Event-based cameras are bio-inspired vision sensors that naturally react to edges moving in the scene with microsecond temporal resolution and minimal motion blur. These intrinsic properties make events ideal for accurate relative motion estimation, especially under challenging motion and lighting conditions where standard cameras often fall short. Nevertheless, estimating motion from event measurements is an open challenge, as motion cues need to be inferred from the complex spatio-temporal structures formed by events, which typical vision-based algorithms struggle to grasp.

Figure 1: An event camera observing two non-parallel lines and moving with constant linear and angular velocity. The events triggered by each line lie on a manifold, which we call an eventail. We derive a minimal 5-point solver to estimate the parameters of the manifold, which include both camera motion and scene geometry. Clustering these events based on spatio-temporal planes as done in previous work [24, 33] generates many spurious clusters (colorful points) with many outliers (grey points). Instead, eventails result in two large clusters with fewer outliers, and a velocity direction error of only 0.01 rad.

Although event-based cameras have recently demonstrated unprecedented performance [41], the recent development of autonomous systems has created an increased demand for more accurate and reliable solutions which could better exploit the improved motion modeling opportunities offered by these sensors. However, while with a traditional camera solving for relative motion simply means aligning two views with sufficient overlap, this problem is not as straightforward to define for an event-based camera, since views are not even available in the first place. Furthermore, even if the fields of view of the camera share substantial overlap at two different points in time, the structure of the perceived events at those two moments remains very much a function of the local camera dynamics. In the worst case, if the camera ceases to move at all, no more events are triggered, and relative pose estimation becomes an ill-posed problem. It is intuitively clear that, for a dynamic vision sensor, the most fundamental problem of relative motion estimation is therefore the determination of local camera dynamics from a relatively short interval of events. The present paper introduces a geometric, deterministic solution to this problem.
The sparse and noisy nature of events has pushed the geometric vision community towards semi-dense approaches that make use of or optimize edge maps [37, 13, 52]. Based on the assumption that the gradient map contains straight lines, a promising area of research therefore looks at line features as a possible alternative to assist the geometric solution of relative event camera motion. Works in this area [24, 33], however, have inherent limitations that stem from a wrong assumption made during the initial feature extraction step. Indeed, they perform feature extraction independently of the relative camera displacement information, and they rely on a simple clustering strategy that models the space-time volume of events generated by a line under motion as a plane. However, as will be explained in detail in this work, lines do not form flat planes in the space-time volume of events, even if the camera undergoes constant linear and angular velocity, as evident from Figure 1.
It is thus clear that the problem of line feature extraction in the space-time volume of events can no longer be considered apart from the problem of dynamics estimation. In the present paper, we depart from this approximation and introduce a novel feature extractor that relies on a rigorously derived geometrical model of line-generated manifolds. Clustering the events of one manifold entails the identification of the manifold parameters, thereby leading to an implicit solution for the linear camera velocity given angular rates measured by an Inertial Measurement Unit (IMU). Specifically, we make the following contributions:
• We introduce a minimal geometric parametrization of the manifolds that contain all events generated by the observation of a single line under the assumption of locally constant, linear velocity. The parametrization involves the velocity components that are non-parallel to the line, as well as a minimal 3D parametrization of the line itself.
• Based on this incidence relationship, we propose a minimal 5-point solver for the manifold parameters, and demonstrate its application in robust clustering for line feature detection and partial camera dynamics determination.
• We conclude with an averaging scheme that fuses the partial camera dynamics observations from each line-generated cluster into a complete estimate of the linear camera velocity, thereby presenting a rigorous theory for deterministic event camera motion initialization.
The present paper focuses on a theoretical understanding of line-generated manifold features, and thereby contributes to a better understanding of the geometry of temporally dense-sampling event cameras. The theory is thoroughly evaluated on simulated data, and the advantage of the method is also demonstrated in a few concluding real-world examples. In particular, we show that our method consistently achieves a 100% success rate in estimating linear velocity, where existing closed-form solvers achieve success rates between 23% and 70%.

Related work
Vision-based camera motion estimation is a long-studied problem, and there have been countless solutions for single-camera, multi-camera, and visual-inertial scenarios. The interested reader is kindly referred to the survey of Cadena et al. [6] for a relatively recent overview. A work worth separate mention, though, is by Weiss et al. [45], who directly estimate camera velocity.
Event-based motion estimation. The present work looks at motion estimation with an event camera, for which the last decade has already seen a number of solutions. Weikersdorfer et al. [44] originally propose a 2D-SLAM system with a dynamic vision sensor by employing a particle filter. The same group also proposes an event-based 3D SLAM framework by fusing events with a classical frame-based RGB-D camera [43]. Other event-based visual odometry systems make use of known depth or 3D structure [7, 30, 12, 5, 8], or are simply limited to the pure rotation scenario [13]. Contrast maximization [13, 11] is proposed as a unifying framework applicable to several event-based vision tasks. It draws substantial attention [22, 39, 26, 32, 31], but it is still limited to homographic warping scenarios. Full 6-DoF estimation is solved by Kim et al. [21] using a filtering approach, and Rebecq et al. [37] using an alternating tracking and mapping framework. Zhu et al. [51], Rebecq et al. [36], and Mueggler et al. [29] furthermore propose more reliable frameworks by fusing the measurements with an IMU.
More practical 6-DoF odometry and SLAM solutions keep being proposed by fusion with other sensors. Kueng et al. [22] combine the event camera with a standard camera to track features and build a probabilistic map. A similar sensor combination is used in Ultimate-SLAM [41], which improves robustness and accuracy by minimizing both vision and event-based residual errors. Zhou et al. [50] propose the first event-based stereo odometry system, while Zuo et al. [52] use a hybrid stereo setup composed of an event and a depth camera to realize DEVO, a semi-dense edge-tracking method. Finally, a recent work by Hidalgo-Carrió et al. [19] introduces EDS, a 6-DoF monocular direct visual odometry that combines events and frames.
Geometry-based motion estimation. Event-based motion estimation can be divided into optimization-based [37, 24, 29], filter-based [51, 44], and learning-based [27, 15] solutions. However, there is a lack of research on how fundamental geometry can be applied to event-based vision. For regular cameras, Weng et al. [46] and Hartley et al. [17] are among the first to introduce closed-form solutions for line-based motion estimation. Bartoli et al. [49], as well as monocular [35] and visual-inertial fusion [18] based approaches, provide solutions to real-time, line feature-based SLAM.
Of particular interest to this work are event-based methods that rely on line features. Yuan et al. [47] and Le Gentil et al. [24] present optimization-based solutions, while Peng et al. [33] use tri-focal tensor geometry to present the first closed-form velocity initialization method. However, these methods make use of event-based line-feature extractors [4, 40] that fail to properly parametrize the line location in both space and time. Furthermore, their minimal solver needs at least 2 lines, and therefore 10 events, to make a single hypothesis.
Everding and Conradt [10] present a low-latency line tracker, while Mitrokhin et al. [28] present a learning-based method to track the surfaces generated by events. Ieng et al. [20] and Seok and Lim [38] finally propose model-based methods to fit and track the surfaces or curves generated by events. Perhaps the most closely related work to ours is that of Ieng et al. [20], who aim at understanding the spatio-temporal sub-space properties of the surfaces generated by events. However, to the best of our knowledge, our work is the first to establish a minimal parametrization of the surface as a function of the observable 3D spatial and dynamic parameters. Furthermore, we are the first to propose a deterministic minimal solver for this problem.

Theory
We assume a calibrated event camera under motion observing a scene that can be approximated by a set of 3D lines. We consider a temporal slice of events from which our objective is to initialize a first-order approximation of the local camera dynamics. The motion of each observed line generates its own set of events, and, while the instantaneous reprojection of a line is still a line, each event cluster is generated by a line that moves and rotates through the space-time volume of events, thereby generating a DNA-like manifold distribution. In the following, we denote such a manifold an eventail. A set of eventails from which we wish to determine the camera velocity parameters is indicated in Figure 1.
The present section presents the theory of our method. We start with preliminaries and notations used throughout the paper. Next, we introduce a simple incidence relation that all events from one eventail need to satisfy. The constraint is transformed into minimal form, which not only reveals the intrinsic geometry of eventail manifolds and its dependency on spatio-temporal parameters, but also permits the derivation of a novel minimal solver that can be used for event clustering and a partial initialization of the camera dynamics. To conclude, we present a complete velocity determination framework in which the partial observations from multiple eventails are merged into one common result.

Notations and preliminaries
We define the time interval of our slice of the space-time volume of events as t ∈ [t_s − ∆t, t_s + ∆t]. The set of events observed during this interval is given by E = {E_i}, i = 1, ..., N, where E_i represents the manifold cluster of events corresponding to the i-th 3D line L_i. Each j-th event of the i-th cluster, e_ij = {u_ij, v_ij, t_ij, p_ij}, is given by its u_ij and v_ij coordinates in the image plane, its timestamp t_ij, and its polarity p_ij. The camera is assumed to be calibrated, and we are given a function [u v]⊺ = π(P) that projects points P ∈ R³ defined in the camera frame into the image plane. Conversely, we are given the inverse function f = π⁻¹(u, v) that transforms image plane coordinates into a unit-norm 3D direction vector defined in the camera frame and pointing towards the corresponding point in 3D. We furthermore define the first-order dynamics parameters v and ω, which represent the instantaneous translational and rotational velocity of the camera, respectively.
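The projection functions π and π⁻¹ can be sketched with a simple pinhole model; the focal length and principal point below are hypothetical calibration values, not taken from the paper:

```python
import numpy as np

# Hypothetical intrinsics for illustration only.
FX, FY, CX, CY = 320.0, 320.0, 320.0, 240.0

def project(P):
    """pi: 3D point in the camera frame -> [u, v] image coordinates."""
    X, Y, Z = P
    return np.array([FX * X / Z + CX, FY * Y / Z + CY])

def back_project(u, v):
    """pi^{-1}: image coordinates -> unit-norm 3D bearing vector f."""
    f = np.array([(u - CX) / FX, (v - CY) / FY, 1.0])
    return f / np.linalg.norm(f)
```

Back-projecting a projected point recovers the bearing towards that point, which is all the incidence relation below requires.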
A line L in R³ can be represented by its direction vector d and a point P that lies on the line. The Plücker coordinates of this line are then defined as L = [d⊺ m⊺]⊺, where m = P × d is referred to as the moment vector. If two non-parallel lines L = [d⊺ m⊺]⊺ and L′ = [d′⊺ m′⊺]⊺ intersect, the following product is required to vanish, i.e.

⟨d, m′⟩ + ⟨d′, m⟩ = 0, (1)
where ⟨·, ·⟩ stands for the inner product. Note that, as long as the moment vector is strictly defined as the cross-product of a point on the line and the exact same direction vector that is used as the first three entries of the Plücker vector [34], the constraint is generally valid and there is no requirement on the norm of the direction vector.
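The Plücker incidence test from (1) is easy to verify numerically; the two example lines below are arbitrary choices made for illustration:

```python
import numpy as np

def plucker(P, d):
    """Plücker coordinates (d, m) of the line through P with direction d."""
    return d, np.cross(P, d)

def incidence(L1, L2):
    """<d1, m2> + <d2, m1>; vanishes iff the (non-parallel) lines intersect."""
    d1, m1 = L1
    d2, m2 = L2
    return np.dot(d1, m2) + np.dot(d2, m1)

# Two lines through the common point [1, 2, 3] satisfy the constraint.
X = np.array([1.0, 2.0, 3.0])
L1 = plucker(X, np.array([1.0, 0.0, 1.0]))
L2 = plucker(X, np.array([0.0, 1.0, -1.0]))
val_intersecting = incidence(L1, L2)

# A line through a different point generally violates it.
L3 = plucker(np.array([0.0, 0.0, 1.0]), np.array([0.0, 1.0, -1.0]))
val_skew = incidence(L1, L3)
```

Note that scaling either direction vector scales the constraint but never changes whether it vanishes, matching the remark on the norm of d above.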

Incidence Relationship
The entire problem is formulated relative to the camera frame at time t_s. Let L_i = [d_i⊺ m_i⊺]⊺ be the Plücker coordinates of a line defined in said reference frame, and let e_ij ∈ E_i be the j-th event triggered by the moving reprojection of the i-th line L_i. Using our first-order dynamics approximation, we can easily define the position of the camera center at time t_ij seen from the reference frame at time t_s as

C[t_ij] = (t_ij − t_s) v,

and the rotation that takes points from the camera frame at time t_ij back to the reference frame at time t_s as R[t_ij] = exp(⌊ω⌋× (t_ij − t_s)). Here, ⌊ω⌋× ∈ R^{3×3} is the skew-symmetric matrix formed from the angular rate ω ∈ R³. Note that, similar to [33], we assume the angular velocity of the camera to be known, as it can be reliably measured by modern IMUs. We present a sensitivity analysis with respect to these measurements in the experimental section. Last, the direction vector pointing from the camera frame at time t_ij to the 3D point on the line L_i that triggered the event is given by f_ij = π⁻¹(u_ij, v_ij). The measurement of the event can now easily be expressed as a ray defined in the reference frame at time t_s, where R[t_ij] f_ij is the direction of the ray, and C[t_ij] represents the origin of the ray (i.e.
a point on the line). The ray is therefore described by the Plücker coordinates

L_ij = [(R[t_ij] f_ij)⊺ (C[t_ij] × R[t_ij] f_ij)⊺]⊺.

Finally, the incidence relation is given by (1), namely

⟨R[t_ij] f_ij, m_i⟩ + ⟨d_i, C[t_ij] × R[t_ij] f_ij⟩ = 0.

Next, we assume that the rotational velocity is already known. In practice, this assumption is easily satisfied by the addition of an IMU. In the continuation, we therefore assume that events are directly represented by their unrotated, normalized coordinates f′_ij = R[t_ij] f_ij and relative timestamps t′_ij = t_ij − t_s, which leads to

⟨f′_ij, m_i⟩ + t′_ij ⟨d_i, v × f′_ij⟩ = 0, (4)

where Figure 2 indicates the geometry of the problem. The above incidence relation is already in polynomial form, and relates our measurements f′_ij and t′_ij to the remaining unknowns v and L_i. However, it is not a minimal parametrization of the eventail manifold, because Plücker coordinates are not minimal representations of lines, and the velocity component that is parallel to the line has no influence on the incidence condition (i.e. it is unobservable, as sliding the camera along the direction of the line does not cause residual errors, a condition also known as the aperture problem).
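The event incidence relation can be checked numerically for the rotation-free case, in which f′_ij coincides with the measured bearing itself; the line, velocity, and sample values below are hypothetical:

```python
import numpy as np

# Sanity check of <f', m> + t' <d, v x f'> = 0 for synthetic noise-free
# events, assuming zero rotation. All concrete values are made up.
d = np.array([0.0, 1.0, 0.2])
P = np.array([0.5, 0.0, 3.0])
m = np.cross(P, d)                      # moment vector of the line
v = np.array([0.3, -0.1, 0.8])          # linear camera velocity

def residual(t, s):
    """Incidence residual for an event observed at time offset t,
    triggered by the (hypothetical) line point P + s*d."""
    C = t * v                           # camera center at time t
    f = (P + s * d) - C                 # bearing from camera to line point
    f = f / np.linalg.norm(f)
    return np.dot(f, m) + t * np.dot(d, np.cross(v, f))

residuals = [residual(t, s) for t, s in [(0.0, 0.0), (0.1, 0.4), (0.25, -0.7)]]
```

Every noise-free event generated this way satisfies the constraint exactly, regardless of which point on the line triggered it.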

Transition into a minimal form
The following three subsections are with respect to a single cluster only, which is why the index i is dropped from the formulation. The transition into minimal form consists of two steps. We start by replacing the representation of the unknown 3D line by a two-point-two-plane parametrization [16], which is minimal. More specifically, we define the 3D line L by its intersection points with the planes x = −1 and x = 1 defined in the reference frame at time t_s. The two points in the reference frame are given by

P_a = [−1 y_a z_a]⊺ and P_b = [1 y_b z_b]⊺.

As required, the new, minimal parametrization has only four degrees of freedom. Note that the orientation of the support planes for the two points can be arbitrarily changed, and their normal vectors can be chosen such that they are parallel to the approximate direction of the line in the image plane at time t_s (the latter can be set by, for example, considering the line connecting event samples). Without loss of generality, here we assume that the reference frame is already defined such that the support planes are given as defined above.
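The conversion into this representation can be sketched as follows; the example line is an arbitrary choice, and the helper assumes the line is not parallel to the support planes (i.e. its direction has a non-zero x component):

```python
import numpy as np

def to_two_point(P, d):
    """Support points of the line P + s*d on the planes x = -1 and x = +1.
    Requires d[0] != 0 (line not parallel to the support planes)."""
    s_a = (-1.0 - P[0]) / d[0]
    s_b = (+1.0 - P[0]) / d[0]
    return P + s_a * d, P + s_b * d

# Arbitrary example line.
P = np.array([0.2, 1.0, 3.0])
d = np.array([0.5, -0.3, 0.1])
P_a, P_b = to_two_point(P, d)
```

Only the four values y_a, z_a, y_b, z_b need to be stored, since the x coordinates are fixed by construction.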
Next, we take care of the above-discussed problem that only part of the velocity vector is observable from a single-line observation under constant first-order dynamics. A minimal parametrization of the eventail manifold can only depend on part of the velocity vector, namely the components that are orthogonal to the line direction vector. To explicitly parametrize the vector as such, we introduce an intermediate, line-dependent reference frame in which the observable part of the camera velocity can be simply defined. The intermediate reference frame is given by R_ℓ = [e_ℓ1 e_ℓ2 e_ℓ3], where e_ℓ1 = P_b − P_a is parallel to the line direction, e_ℓ2 = P_a × P_b is orthogonal to the plane traversing the camera center and the line L, and e_ℓ3 = e_ℓ1 × e_ℓ2 is contained in the latter plane but points orthogonally away from the line. Only velocity components along e_ℓ2 and e_ℓ3 are observable, hence the velocity can be substituted by

v = R_ℓ [0 v_y v_z]⊺ = v_y e_ℓ2 + v_z e_ℓ3.

As required, this parametrization has only two additional degrees of freedom (i.e. v_y and v_z). Note that R_ℓ is not an orthonormal rotation matrix; it only represents an orthogonal basis for the minimal definition of the velocity. Using the new parameters, incidence relation (4) becomes

⟨f′_ij, P_a × P_b⟩ + t′_ij ⟨P_b − P_a, (v_y e_ℓ2 + v_z e_ℓ3) × f′_ij⟩ = 0.

The two-point-two-plane parametrization and the intermediate, line-dependent reference frame are shown in Figure 2.
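The aperture-problem argument behind this substitution can be verified numerically: adding any multiple of the line direction d to v leaves the incidence residual unchanged, since ⟨d, d × f′⟩ = 0 identically. All values below are arbitrary:

```python
import numpy as np

# Check that the velocity component parallel to the line is unobservable:
# shifting v by alpha * d does not change <f', m> + t' <d, v x f'>.
rng = np.random.default_rng(0)
d, P, v, f = rng.normal(size=(4, 3))    # arbitrary line, point, velocity, bearing
m = np.cross(P, d)
t = 0.2

def res(vel):
    return np.dot(f, m) + t * np.dot(d, np.cross(vel, f))

gap = res(v + 1.7 * d) - res(v)         # extra term is t * 1.7 * <d, d x f> = 0
```

This is exactly why a minimal parametrization may only carry the two components v_y and v_z.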

Elementary properties of the minimal form
Before introducing the solver, we list two corollaries of the minimal form along with their corresponding proofs.
Corollary 1: The solution in terms of motion and structure parameters is scale invariant. In the case of our parametrization, the scale invariance is entirely reflected by the structure parameters: scaling the line L scales the velocity basis vectors e_ℓ2 and e_ℓ3 such that v_y and v_z remain unchanged, as the latter are constant ratios.

Proof: It is sufficient to prove that scaling the line points P_a and P_b by a factor k scales the basis vectors e_ℓ2 and e_ℓ3 equally by k. Let us denote the intersection points of the scaled line with the planes x = −1 and x = 1 by P′_a and P′_b. P′_a and P′_b must lie on the line connecting kP_a and kP_b such that their first coordinates equal −1 and 1, respectively. We therefore have

P′_a = ((k+1)/2) P_a + ((k−1)/2) P_b and P′_b = ((k−1)/2) P_a + ((k+1)/2) P_b.

Back-substituting into the basis definitions, we obtain e′_ℓ1 = P′_b − P′_a = P_b − P_a = e_ℓ1. The new basis vector e′_ℓ2 is finally given by

e′_ℓ2 = P′_a × P′_b = (((k+1)² − (k−1)²)/4) P_a × P_b = k e_ℓ2.

Next, it is easy to see that e′_ℓ3 = e′_ℓ1 × e′_ℓ2 = k e_ℓ3.

■
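The corollary is easy to verify numerically, assuming the basis definitions e_ℓ1 = P_b − P_a and e_ℓ2 = P_a × P_b used above; the example support points are arbitrary:

```python
import numpy as np

def reintersect(Q_a, Q_b):
    """Support points on x = -1 and x = +1 of the line through Q_a, Q_b."""
    d = Q_b - Q_a
    return Q_a + (-1 - Q_a[0]) / d[0] * d, Q_a + (1 - Q_a[0]) / d[0] * d

# Arbitrary example line and scale factor.
P_a = np.array([-1.0, 0.4, 2.0])
P_b = np.array([1.0, -0.3, 2.5])
k = 3.0

# Scale the line by k, then re-intersect with the support planes.
Pp_a, Pp_b = reintersect(k * P_a, k * P_b)

e2 = np.cross(P_a, P_b)                 # original e_l2
e2p = np.cross(Pp_a, Pp_b)              # scaled e'_l2, expected k * e2
```

The direction e_ℓ1 = P_b − P_a is unchanged by the scaling, while e_ℓ2 (and hence e_ℓ3) picks up exactly the factor k.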
Corollary 2: There exists a solution duality, i.e. if {P_a, P_b, v_ℓ} is a valid solution, then {P′_a = −P_b, P′_b = −P_a, v′_ℓ = v_ℓ} is also a solution. It notably corresponds to a reversal of the velocity and a placement of the line behind the camera. It is furthermore interesting to note that only the line parameters are affected by the solution duality.

Proof: The dual support points again lie on the planes x = −1 and x = 1, and yield the basis vectors

e′_ℓ1 = P′_b − P′_a = P_b − P_a = e_ℓ1, e′_ℓ2 = P′_a × P′_b = P_b × P_a = −e_ℓ2, and e′_ℓ3 = e′_ℓ1 × e′_ℓ2 = −e_ℓ3.

Given that the first coordinate of v_ℓ is always zero, we therefore have

v′ = R′_ℓ v_ℓ = −v_y e_ℓ2 − v_z e_ℓ3 = −v.

Finally, for our incidence relation, with d′ = d and m′ = −m, we have

⟨f′_ij, −m⟩ + t′_ij ⟨d, (−v) × f′_ij⟩ = −(⟨f′_ij, m⟩ + t′_ij ⟨d, v × f′_ij⟩) = 0. ■
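The duality can likewise be checked numerically; the construction below generates one synthetic event from the original solution and evaluates the incidence residual for both solutions (all concrete values are arbitrary):

```python
import numpy as np

P_a = np.array([-1.0, 0.2, 1.5])
P_b = np.array([1.0, -0.4, 2.0])

def basis(Pa, Pb):
    """Line-dependent basis e_l1, e_l2, e_l3 as defined in the text."""
    e1 = Pb - Pa
    e2 = np.cross(Pa, Pb)
    return e1, e2, np.cross(e1, e2)

def res(Pa, Pb, vel, f, t):
    """Incidence residual <f, m> + t <d, vel x f> for a support-point pair."""
    d, m = Pb - Pa, np.cross(Pa, Pb)
    return np.dot(f, m) + t * np.dot(d, np.cross(vel, f))

e1, e2, e3 = basis(P_a, P_b)
v = 0.3 * e2 - 0.2 * e3                 # v_y = 0.3, v_z = -0.2 (arbitrary)

# One noise-free event generated from the original solution.
t = 0.15
X = P_a + 0.6 * (P_b - P_a)             # a point on the line
f = (X - t * v) / np.linalg.norm(X - t * v)

# Dual solution: same coefficients (v_y, v_z) in the flipped basis.
e1d, e2d, e3d = basis(-P_b, -P_a)
v_dual = 0.3 * e2d - 0.2 * e3d          # equals -v
r_orig = res(P_a, P_b, v, f, t)
r_dual = res(-P_b, -P_a, v_dual, f, t)
```

Both residuals vanish on the same event, illustrating why the solver must return two solutions.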

Five-point Minimal Solver
There exist six unknowns in our formulation. However, as mentioned in the above corollaries, there is an additional scale ambiguity in the line's support points. Hence, the inherent number of degrees of freedom is five, and a minimal solver for an eventail can be found by constructing five incidence constraints from five randomly picked events from the cluster. To remove the scale invariance, an additional constraint on the scale is added. Given that only the structure parameters are affected by the scale invariance, the scale constraint needs to involve the related variables; we therefore add an equation that fixes the scale of the support-point parameters. Using Gröbner basis theory [9], it is easy to find out that this problem has only two solutions. However, the fact that part of the solution variables is uniquely defined leads to some peculiarities in the derivation of the elimination template [23]. Simply put, there are variable orderings for which the template leads to an equation where either v_y or v_z is a constant. More importantly, after back-substituting this variable into the other equations, no simple solution for the remaining variables is apparent; instead, a second elimination template of substantial size needs to be solved.
To solve this problem, we analyzed all 720 possible monomial orderings and the related elimination templates. We found that 240 of the 720 orderings directly lead to the unique solution for either v_y or v_z and are unlikely to reveal the double solution for the line's support points. The remaining 480 orderings and related elimination templates directly lead to the two solutions. The elimination templates can vary quite substantially in size, ranging from 100×428 to 244×820.
The smallest template that directly leads to the two solutions is indicated in Figure 3, and has a size of 154×578. Note that, given that the elimination leads to only two solutions, no action matrix decomposition is needed. In turn, the actual solutions are straightforward to recover from the last few rows of the template. Among these, we select the ordering that leads to one quadratic equation in y_b, which can be back-substituted into three further equations to yield univariate constraints in y_a, z_a, and z_b, respectively. With the structure parameters known, any two events can be used to construct bivariate linear constraints on v_y and v_z.
We embed the 5-point solver into RANSAC and use it to find the parameters of the eventail, which correspond to a partial dynamics recovery as well as an event clustering. Sampling strategies and inlier criteria are discussed in Section 4.
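The surrounding RANSAC loop can be sketched generically. Since the actual 5-point eventail solver is beyond the scope of this sketch, a hypothetical 2-point line fit stands in for the minimal solver; the loop structure (sample a minimal set, solve, count inliers, keep the best hypothesis) is what carries over:

```python
import numpy as np

rng = np.random.default_rng(1)

def ransac(points, n_min, solve, residual, thresh, iters=200):
    """Generic RANSAC: returns the hypothesis with the largest inlier set."""
    best_model, best_inliers = None, np.zeros(len(points), dtype=bool)
    for _ in range(iters):
        sample = points[rng.choice(len(points), n_min, replace=False)]
        model = solve(sample)
        inliers = residual(model, points) < thresh
        if inliers.sum() > best_inliers.sum():
            best_model, best_inliers = model, inliers
    return best_model, best_inliers

# Toy data: points near y = 2x + 1 plus gross outliers.
xs = np.linspace(0, 1, 50)
pts = np.column_stack([xs, 2 * xs + 1 + rng.normal(0, 0.01, 50)])
pts[::10] += rng.normal(0, 5, (5, 2))   # every 10th point becomes an outlier

def solve_line(s):                      # stand-in minimal solver: (slope, intercept)
    (x0, y0), (x1, y1) = s
    a = (y1 - y0) / (x1 - x0 + 1e-12)
    return a, y0 - a * x0

def line_res(model, p):
    a, b = model
    return np.abs(p[:, 1] - (a * p[:, 0] + b))

model, inliers = ransac(pts, 2, solve_line, line_res, thresh=0.05)
```

In the actual pipeline, `solve_line` and `line_res` would be replaced by the 5-point eventail solver and the angular reprojection error, respectively.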

Velocity averaging from multiple eventails
We now move back to multiple eventails, which is why the index i is reintroduced. The result we obtain from the clustering algorithm is a partial observation of the velocity v for each eventail. The observation is given by the two ratios v^ℓ_yi and v^ℓ_zi scaling the second and third basis vectors of R_ℓi, respectively. Introducing the unobservable component κ_i along the first basis vector, we have

v = κ_i e_ℓ1i + v^ℓ_yi e_ℓ2i + v^ℓ_zi e_ℓ3i.

By multiplying the equation from the left with either the transpose of e_ℓ2i or e_ℓ3i, and exploiting the orthogonality of the basis vectors, we easily obtain

e_ℓ2i⊺ v = v^ℓ_yi ∥e_ℓ2i∥² and e_ℓ3i⊺ v = v^ℓ_zi ∥e_ℓ3i∥².

Stacking the result from all N observed lines and taking into account that each individual partial velocity observation is affected by an unknown scale factor, we obtain the linear averaging scheme

[A B] [v⊺ λ_1 · · · λ_N]⊺ = 0,

where the rows of A stack the vectors e_ℓ2i⊺ and e_ℓ3i⊺, the i-th column of B contains the corresponding entries −v^ℓ_yi∥e_ℓ2i∥² and −v^ℓ_zi∥e_ℓ3i∥², and the additionally requested scaling factors are given by λ_1, ..., λ_N. Multiplying the equation from the left with [A B]⊺, we obtain the form

[A⊺A A⊺B; B⊺A V] [v⊺ λ_1 · · · λ_N]⊺ = 0, with V = B⊺B.

Applying the Schur complement trick, we easily obtain

(A⊺A − A⊺B V⁻¹ B⊺A) v = 0,

which lets us find v via eigendecomposition of a 3×3 matrix. Note that V⁻¹ is computed in linear time by simply inverting each element along the diagonal of V.
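A minimal numerical sketch of this averaging scheme, using synthetic eventail observations with per-line scale ambiguity (all inputs are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
v_true = np.array([0.4, -0.2, 1.0])     # hypothetical ground-truth velocity
N = 4                                    # number of eventails
A = np.zeros((2 * N, 3))
B = np.zeros((2 * N, N))

for i in range(N):
    # Random line via its support points on x = -1 and x = +1.
    P_a = np.array([-1.0, *rng.normal(size=2)])
    P_b = np.array([1.0, *rng.normal(size=2)])
    e2 = np.cross(P_a, P_b)
    e3 = np.cross(P_b - P_a, e2)
    # Observed ratios are scale invariant; the basis is known only up to k.
    vy = np.dot(e2, v_true) / np.dot(e2, e2)
    vz = np.dot(e3, v_true) / np.dot(e3, e3)
    k = rng.uniform(0.5, 2.0)            # unknown per-line scale
    A[2 * i], A[2 * i + 1] = k * e2, k * e3
    B[2 * i, i] = -vy * np.dot(k * e2, k * e2)
    B[2 * i + 1, i] = -vz * np.dot(k * e3, k * e3)

V = B.T @ B                              # diagonal: B's columns have disjoint support
S = A.T @ A - A.T @ B @ np.linalg.inv(V) @ B.T @ A
w, U = np.linalg.eigh(S)                 # eigenvalues in ascending order
v_est = U[:, 0]                          # null-space direction of S
```

The recovered eigenvector matches the true velocity direction up to sign, which mirrors the scale/direction ambiguity discussed in the corollaries.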

Experiments
We perform both simulation tests and real-world experiments. We first confirm the theoretical correctness of the proposed 5-point solver and its noise resilience, and discuss the implications of different event sampling strategies. Next, we test the influence of violations of the constant linear velocity motion assumption. We furthermore experimentally confirm the non-planar nature of the eventail manifold, and conclude with experiments on a public benchmark, demonstrating the advantage over existing bootstrapping methods. In order to evaluate the accuracy of the results, we adopt one of the criteria in [33], namely the direction error ϕ between the estimated and the ground truth velocities, since scale is not observable.

Noise resilience on single cluster
We start by evaluating the performance of the 5-point solver under different noise setups on synthetic data, and discuss the impact of different event sampling strategies. Samples from individual manifolds are generated as follows. We first randomly generate two lines in the image plane that represent the location of the line at the beginning and the end of the temporal window of events. The interval length is set to 0.5 s. Next, we sample randomly directed linear and angular velocities of 1.0 m/s and 90°/s magnitude, respectively. By factoring in the interval duration, we can deduce the relative camera location between the beginning and the end of the interval using a linear motion model, and extract the corresponding 3D line via triangulation. The line is furthermore given a finite length in 3D. The virtual event camera has a resolution of 640×480 and a focal length of 320 pixels. This way of defining experiments ensures that the line passes through a sufficiently large area of the image canvas during the interval, thereby simulating well-posed problems in which the camera exhibits sufficient displacement relative to the line in 3D. We randomly sample events by first sampling a point on the 3D line, and then sampling the timestamp at which that point is projected into the image plane. Each such projection is then denoted an event. We analyze four strategies for sampling events in this way:
• Random: event timestamps and 3D points on the line are both sampled randomly.
• Temporal: the time interval is evenly divided into five sub-intervals, and each of the five events is assigned a random timestamp within an individual sub-interval.
• Spatial: the line is evenly divided into five segments, ensuring that each of the five events is triggered by a random 3D point on an individual sub-segment.
• Spatio-temporal: a combination of the spatial and temporal sampling strategies.
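The temporal strategy, for example, can be sketched as a stratified draw over the window (the interval bounds are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

def temporal_sample(t_start, t_end, n=5):
    """Split [t_start, t_end] into n sub-intervals and draw one
    timestamp uniformly from each (stratified temporal sampling)."""
    edges = np.linspace(t_start, t_end, n + 1)
    return np.array([rng.uniform(lo, hi) for lo, hi in zip(edges[:-1], edges[1:])])

ts = temporal_sample(0.0, 0.5)          # 0.5 s window, as in the simulation setup
```

The spatial variant applies the same stratification to the line parameter instead of the timestamp, and the spatio-temporal variant applies both.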
To conclude the experiment setup, we introduce three types of noise sources with different magnitudes, namely pixel noise, timestamp jitter, and noise on the camera's angular velocity, which is assumed to be given by an auxiliary sensor. The magnitude of the pixel noise and the noise on the camera angular velocities is consistent within the same noise level but varies in direction. Zero-mean Gaussian noise is used for the timestamps. Results are presented in Figure 4. Each box in the plot represents the mean errors from fifteen different geometry-motion configurations. Within each configuration, we conducted 100 evaluations with different event samples. Note that since the 5-point solver only takes readings from one cluster and produces a partial observation, we compare the estimated velocity with the normalized ground truth velocity in the direction perpendicular to the line. Moreover, as is standard in SLAM evaluation pipelines, we do not report evaluations of the line parameters, as their error is subsumed in the motion errors.
Errors generally increase with the noise level. Without noise, our solver always produces zero-error results, confirming the theoretical correctness of the proposed method. There is, however, no single sampling strategy that produces minimal errors across all noise sources, although ensuring spatial distribution among the five sampled events is crucial for accurate results. In the following RANSAC experiments, we therefore alternate between spatial and spatio-temporal sampling.

Multiple clusters and validity of motion model
We have developed an Event-based Geometric wireframe Generator, entitled EGG, in order to conduct experiments over longer time periods. The wireframes are composed of finite lines which simulate strong gradient locations in the scene. Note that the direction of the gradient does not need to be specified, given that we ignore polarity in our formulation. The simulator supports different types of motion models, including both spline-based and non-polynomial motion. Events are generated whenever a projected line comes across a pixel center, thereby producing events with accurate timestamps. A realistic IMU model is also implemented with bias and random walk. In a nutshell, the EGG simulator we have developed is capable of producing accurate ground truth readings of poses and twists, corrupted IMU readings, as well as line-generated events with precise timestamps.
We now investigate the robustness and accuracy of the overall velocity determination from multiple eventails. We generate ten line segments within a predefined volume. Each event is labeled with its corresponding line to bypass clustering in the present experiment. In the next two sections, we evaluate the full RANSAC pipeline. Note, however, that we still run RANSAC to fit the eventail parameters to each event cluster. We adopt the angular reprojection error [25] within the RANSAC algorithm. We simulate two types of motion, both of which violate the constant linear motion assumption. The first moves in a circular arc, with a tangential velocity of 1.0 m/s and a radius ranging from 2 m to 10 m in increments of 2 m. Each sequence lasts 0.3 s, and so does the chosen time window. For each configuration, we generate twenty sets of events without noise and another twenty with noise. Perturbations are given by zero-mean Gaussian noise with a standard deviation of 1 pixel on location and 1 ms on timestamps. We also use corrupted IMU readings with a realistic bias. Finally, the ground truth camera motion has no rotation and faces a constant direction. In the second simulation, we add an acceleration with magnitude ranging from 0.1 m/s² to 0.5 m/s² to the linear velocity. Again, the time window is 0.3 s.
Figure 5 shows that without noise and model violations, our method always produces zero-error results. As the velocity becomes less constant (larger acceleration, smaller radius), the error increases, but the proposed solver still shows high noise robustness under these conditions.

Performance under high dynamics
To demonstrate the full potential of our parametrization, we generate sequences of 1 s with constant but very significant linear and angular velocities. Two non-parallel lines with comparable depth, i.e. from [0, 0.75, 3] to [0, 2, 3] and from [0.38, −0.65, 3] to [0.75, −1.3, 3], respectively, are placed in front of the camera. The camera moves towards the lines with a linear velocity of [0.4, 0.4, 2.0] m/s and a self-rotation of [0, 0, −2π] rad/s. The noise level is kept the same as in the previous simulation. As showcased in the example in Figure 1, our method successfully fits the two manifolds from five events with inlier ratios of 67.16% and 73.22%, respectively, and a velocity direction error of only 0.01 rad. On the other hand, traditional plane-based fitting fails in this case and splits the set up into 34 subsets. As explained next, this difference in clustering impacts the achievable final accuracy.

Real-world Experiment
We validate the method on four sequences from a public benchmark dataset [14] which feature clear line structures. The dataset provides VGA event recordings from a Gen3 Prophesee camera, 200 Hz ground-truth camera poses from a MoCap system, and 200 Hz measurements from an XSens MTi-30 AHRS IMU. To increase efficiency, we downsample the events in each sequence by a factor of ten. We then split each sequence into non-overlapping intervals of 0.3 s duration. To find individual line clusters, we adopt two different approaches. In the first approach, we find spatio-temporal plane clusters using cilantro [48], as was done in [33]; when sampling 5-tuples of events from these clusters, we count inliers over the entire event set, which is more robust than considering only events within that cluster, as done in [33]. In this experiment, we set the maximum number of event clusters to five. The second approach is unique to our method: we run RANSAC on the entire window of events until a maximal inlier ratio is found, extract the corresponding line and partial velocity estimate together with the inlier events, and repeat this step up to five times. We use this strategy whenever cilantro fails to provide sufficient clusters, which makes our method strictly more robust than [33], which cannot recover when cilantro fails. The estimated velocity is obtained by velocity averaging over multiple eventails and is compared against the ground-truth velocity in terms of direction error ϕ. Comparative results are listed in Table 1, reporting the mean and median error of successful samples in each sequence. When there are too few clusters, CELC+opt fails to output valid results on some samples; we therefore also restrict the evaluation of our method to the subsequences where CELC+opt is successful, and denote these results with an asterisk *.

Results. We find that while the success rate of [33] ranges between 23% and 70%, our method consistently achieves 100%. Moreover, on the subset where [33] is successful, both methods are comparable in terms of accuracy. This highlights that our method significantly improves on the robustness of existing closed-form solvers for linear velocity.

Table 1. Real-world results on the VECtor benchmark [14]. We report the success rate, i.e. the percentage of sequence sections where the algorithm outputs reasonable results, as well as the velocity direction error ϕ* computed only on these sub-sections. For our method, which always has a 100% success rate, we also report the error ϕ over the full sequence.
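The sequential RANSAC strategy described above (fit the dominant model, remove its inliers, repeat) can be sketched generically as below. The callables `fit_eventail` and `count_inliers` are placeholders for the paper's 5-point minimal solver and angular-reprojection inlier test, which are not reproduced here; this is an illustration of the greedy multi-model loop only.

```python
import numpy as np

def sequential_ransac(events, fit_eventail, count_inliers,
                      n_models=5, n_iters=500, min_sample=5, rng=None):
    """Greedy multi-model fitting: repeatedly run RANSAC with minimal
    samples on the remaining events, keep the model with the most
    inliers, remove those inliers, and repeat up to n_models times."""
    rng = np.random.default_rng() if rng is None else rng
    remaining = np.arange(len(events))
    models = []
    for _ in range(n_models):
        if len(remaining) < min_sample:
            break
        best_model, best_inliers = None, np.array([], dtype=int)
        for _ in range(n_iters):
            sample = rng.choice(remaining, size=min_sample, replace=False)
            model = fit_eventail(events[sample])
            if model is None:        # degenerate minimal sample
                continue
            mask = count_inliers(model, events[remaining])
            inliers = remaining[mask]
            if len(inliers) > len(best_inliers):
                best_model, best_inliers = model, inliers
        if best_model is None:
            break
        models.append((best_model, best_inliers))
        remaining = np.setdiff1d(remaining, best_inliers)
    return models
```

As a design note, counting inliers over all remaining events (rather than only within a pre-computed cluster) is what allows recovery when the initial plane-based clustering fails.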

Conclusion and Future Work
The present work provides a new understanding of the geometry of line-generated events perceived under constant linear and angular velocities. We have derived a theoretically correct parametrization of the manifold distribution of such events, and showed how the parametrization partially involves camera dynamics parameters. Our minimal solver for the manifold parameters has been successfully embedded into RANSAC, based on which we studied resilience against noise and motion model violations. Extensive tests on real-world data have validated the superior ability of our method to fit eventail manifold parameters, thereby increasing accuracy and the overall success rate of the presented bootstrapping algorithm.
Prospective future directions include extending the proposed averaging scheme with uncertainty estimates from multiple manifolds, as well as optimizing over the cluster inliers for improved robustness and accuracy of the results. Moreover, while we showed that our sequential RANSAC approach is sufficient for fitting multiple lines to a given event stream, more sophisticated robust multi-model fitting techniques such as Progressive-X [3] could be used to minimize the number of outliers. Finally, the current method omits event polarity and does not perform any temporal smoothing or fusion of accelerometer or gyroscope readings from the IMU. Extending the approach in this direction would make it acceleration-aware, as in [8], and lead to improved modeling of the camera motion. Despite these limitations, we believe that our findings lay the cornerstones for highly successful, incremental smoothing-based motion estimation.

Figure 1. An event camera observing two non-parallel lines and moving with constant linear and angular velocity. The events triggered by each line lie on a manifold, which we call an eventail. We derive a minimal 5-point solver to estimate the parameters of the manifold, which include both camera motion and scene geometry. Clustering these events based on spatio-temporal planes as done in previous work [24, 33] generates many spurious clusters (colorful points) with many outliers (grey points). Instead, eventails result in two large clusters with fewer outliers, and a velocity direction error of only 0.01 rad.

Figure 2. Incidence relationship between the line L under the two-point-two-plane parametrization and the event with bearing vector f′_j. The camera velocity is given in the line-dependent reference frame R_ℓ = [e_ℓ1 e_ℓ2 e_ℓ3].

Figure 3. The chosen elimination template for our 5-point solver.

Figure 4. Results for the directional accuracy of the partially observed camera velocity as a function of noise in the event timestamps, the measured angular velocity, and the event locations. Each box denotes the range from the first to the third quartile of the error distribution; the median is marked by the black line in the middle.

Figure 5. Average directional errors of the fully estimated linear velocity over ten eventails. Results are evaluated for clean and noisy data, and for different violations of the motion model assumptions.
and Sturm [1, 2] introduce complete line-based structure from motion. More recently, based on modern line-feature extraction methods such as LSD [42], Zhang et al., Pumarola et al., and He et al. propose stereo