About
Wheelchair Basketball Coaching

PHYSNET project: Physical Interaction Using the Internet: The Virtual Coach Intervention

Tele-immersive technology could significantly improve access to knowledgeable coaches, thereby improving athletes' ability to acquire the knowledge and skills necessary to engage in physical activity competently and without injury. Even with state-of-the-art sport wheelchairs (equipped with waist belts and anti-tipping features), stopping and turning can result in soft-tissue injuries to the hands (e.g., blisters), sprained or strained fingers, and other upper-extremity injuries. In a full tele-immersive environment, the learning of shooting techniques and training drills (hook pass, figure-eight dribble, one-on-one defense) involves very limited co-action with other players, minimizing the likelihood of injury, and all activities are closely supervised remotely by the coach.

Tele-immersive technology can create digital clones of people and objects at multiple geographical locations and then place them into a shared virtual space in real time. This technology turns out to be very useful for citizens with limited proprioception (the sense of the relative position of neighbouring parts of the body and of locomotion). Tele-immersive environments can provide spatial cues that help users regain proprioception, support the training of children, athletes, and veterans with disabilities, and facilitate physical interaction and communication between persons with disabilities and their relatives and others in their homes and workplaces.

The team at NCSA/UIUC works closely with Disability Resources and Educational Services at the University of Illinois (DRES) and with wheelchair basketball players. They have researched and developed a prototype tele-immersive technology that addresses the problems of (a) adaptive placement of stereo camera networks for optimal deployment; (b) robust performance under illumination changes using thermal infrared and visible spectrum imaging; and (c) quantitative understanding of the value of tele-immersive environments for citizens with limited proprioception. The figures below illustrate the technical challenges of building and deploying robust, inexpensive, and portable tele-immersive systems, as well as of evaluating the value of such technologies for citizens with disabilities. The experience gained from the current work on wheelchair basketball coaching and proprioception applications has opened the door to studying physical interactions using the Internet. We collaborate with Mike Frogley, head coach of the men's and women's wheelchair basketball teams at the University of Illinois.

How Does the System Work?
   					
                			
The required infrastructure for a tele-immersive system includes everything from networking to stereo camera rigs, along with software controlling the operation of the cameras, calibration, image acquisition, synchronization, 3D reconstruction, and foreground detection.
    								
| System Requirement | Approach |
| Key functionality: raw images with (x, y, z, t, spectral) information; knowledge about interactions of cloned objects and virtual objects; interaction and feedback in real time. | Stereo vision, time-of-flight ranging, multi-spectral imaging, advanced analyses of scene measurements, analyses of cloned and virtual interactions, multi-sensory feedback to humans. |
| Portability: re-configurable hardware, easy synchronization and calibration. | Portable mounting of hardware (tripods and carts), automated parameter configuration. |
| (Low) cost: < $50K, in comparison to commercial solutions at > $0.5M. | COTS components, open source software, Web 2.0. |
| Robustness: invariance to environment variables, scalability with the number of hardware components and the computational resources, adaptability to LAN and web networking resources, real-time performance. | Multi-spectral sensing, scalable algorithms for 3D reconstruction (stereo vision) and rendering (fusion of multiple streams) to accommodate a variable number of cores on PCs, distributed system operation using TCP/IP and UDP protocols over existing networks, data compression, object recognition. |

Data volume and networking: Uncompressed data streams have huge bandwidth requirements. One 3D stream requires approximately 23 MBytes/sec (640x480 [pixels/frame] x 5 [bytes/pixel] x 15 [frames/sec] = 23,040,000 [bytes/sec]). One tele-immersive session (10 3D streams) requires approximately 230 MBytes/sec. In addition, the streams have to be fused despite differing time latencies.
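The bandwidth figures above follow directly from the stream parameters; a minimal Python check (the function name is illustrative, not part of the TEEVE software):

```python
def stream_bandwidth(width: int, height: int, bytes_per_pixel: int, fps: int) -> int:
    """Raw bandwidth of one uncompressed 3D stream, in bytes per second."""
    return width * height * bytes_per_pixel * fps

# Figures quoted above: 640x480 pixels, 5 bytes/pixel, 15 frames/sec.
one_stream = stream_bandwidth(640, 480, 5, 15)  # 23,040,000 bytes/sec (~23 MB/s)
session = 10 * one_stream                       # ~230 MB/s for a 10-stream session
print(one_stream, session)
```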
 
Figure 1: A sketch of our experimental setup (physical space). Approximate room size is 15 ft x 15 ft.

System operation:
- The session controller registers gateways.
- A gateway machine manages its local cameras and displays, and relays the content to other gateways.
- The trigger server synchronizes the cameras. It sends a message to its gateway with the number of cameras in order to create a new session.
- A camera machine performs all computation on four video streams and sends the content to the gateway on the port assigned by the gateway. After the exact number of cameras has joined the session, the gateway makes the content available to local renderers and/or other gateways.
- A renderer (or display) machine runs a small daemon that maintains the connection with the gateway and performs the fusion of all local and remote content.

Vision algorithms that perform 3D reconstruction primarily rely on knowledge of the
							camera positions and orientations with respect to some reference frame. Camera calibration is the process of 
							determining the geometric and optical parameters which describe the transformation from an object in 
the world to its image detected by the camera system. These parameters are usually grouped into intrinsic and extrinsic parameters. Intrinsic parameters describe quantities that are affected by the optical and electrical components of a camera: (a) focal length, (b) pixel aspect ratio, (c) principal point, (d) lens distortion, etc. Extrinsic parameters describe the geometric position and
							orientation of the camera with respect to the world. 
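As a concrete sketch of how these parameters are used, the following projects a 3D world point into pixel coordinates via the standard pinhole model x = K [R | t] X. All numeric values are illustrative, not from the actual TEEVE calibration, and lens distortion is ignored:

```python
import numpy as np

# Intrinsic matrix K: focal lengths fx, fy and principal point (cx, cy).
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

# Extrinsic parameters: rotation R and translation t map world
# coordinates into the camera frame (here: camera 2 m from the origin).
R = np.eye(3)
t = np.array([0.0, 0.0, 2.0])

def project(X_world: np.ndarray) -> np.ndarray:
    """Project a 3D world point to pixel coordinates (pinhole, no distortion)."""
    X_cam = R @ X_world + t   # extrinsic: world -> camera frame
    x = K @ X_cam             # intrinsic: camera frame -> image plane
    return x[:2] / x[2]       # perspective divide

# The world origin, straight ahead of this camera, lands on the
# principal point (320, 240).
print(project(np.array([0.0, 0.0, 0.0])))
```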
Figure 2: Image depicts two trinocular stereo clusters and one thermal infrared camera. The stereo cluster contains three grayscale and one color Dragonfly digital cameras. The thermal camera is an uncooled microbolometer, which detects thermal energy at LWIR (7.5 to 13.5 micron) wavelengths.

Our calibration method is based on the technique of Tomáš Svoboda and co-workers [pdf 1.7MB], with a calibration time of about 1-3 hours. Our goal is to automate the calibration of intrinsic parameters while enabling manual calibration of extrinsic parameters. A high frame rate is required for motion capture (e.g., a bouncing basketball). Our frame rate is 20-24 fps on a quad-core machine, compared to the original 2-5 fps on a single-core and 12 fps on a dual-core machine.

Integration of thermal and visible imagery for robust foreground detection in tele-immersive spaces

The central objects of interest in a tele-immersive system are the people, the things they jointly manipulate, and the tools they need to perform this manipulation. Certain assumptions about the image are made in order to reduce the complexity of the detection problem and to make it easier to find objects of interest. Typical assumptions are: background materials have non-reflective surfaces, there is an intensity differential between background and foreground objects, scene illumination is constant, the foreground object is uniformly illuminated by diffuse lighting, and the scene lighting exhibits a constant power spectrum. Unfortunately, these assumptions often break down in practice. In particular, five characteristics of real scenes
							cause problems in the current TEEVE system: changing illumination, moving foreground objects (causing shadows), 
							moving background objects, lack of contrast between foreground and background objects, and lack of contrast between different
							foreground objects. 
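The fragility of these assumptions is easy to see in the simplest background-subtraction scheme, sketched below (a generic textbook rule, not the TEEVE implementation; the threshold value is illustrative). Because the rule implicitly assumes constant illumination, any global lighting change is misreported as foreground:

```python
import numpy as np

def subtract_background(frame: np.ndarray, background: np.ndarray,
                        threshold: float = 30.0) -> np.ndarray:
    """Per-pixel background subtraction on grayscale images.

    Returns a boolean foreground mask: True wherever the current frame
    differs from the stored background by more than the threshold.
    """
    return np.abs(frame.astype(float) - background.astype(float)) > threshold

# Illustration: a uniform background that simply got brighter (a lamp
# was turned on); no object actually entered the scene.
background = np.full((4, 4), 100, dtype=np.uint8)
brighter   = np.full((4, 4), 150, dtype=np.uint8)
mask = subtract_background(brighter, background)
print(mask.all())  # every pixel is (wrongly) flagged as foreground
```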
							
							 
							A method of fusing information from visible and thermal infrared cameras 
can solve all five of these problems.

Figure 3: Problems in object detection:
							Top images are the current frames, below are static backgrounds taken before image acquisition, 
							current minus static background is the third from top and finally bottom images show the foreground detection.
							(a) Changing illumination: In this sequence, a small lamp was turned on before 
							(current) acquisition, causing large portions of the background to be seen as foreground.
							(b) Moving foreground objects: Moving foregrounds cast shadows
							and change the general scattering environment of the scene. In this sequence, background objects are mistaken as foreground.
							(c) Moving background objects: The computer display (a background object) has changed 
							appearance between the acquisition of the background and the current frame. Thus, the current system treats these changed 
							pixels as foreground.
							(d) Low contrast between foreground and background: The visible modality has a difficulty 
							classifying foreground objects that have intensities or colors (dark shirt here) that closely match the background model.
							(e) Low contrast between different foreground objects: The current TEEVE system
cannot distinguish between the object of interest (the ball) and a rectangular object, thus classifying them both as foreground.
							 Visual and thermal cameras provide fundamentally different information. Where visual
							cameras primarily measure how materials reflect light, thermal cameras primarily
							measure temperature. These differences of content mean that a combination of visual
							and infrared images can provide more information about a scene than either modality
							used alone. 
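As an illustration of why the two modalities complement each other, a minimal fusion rule (our own sketch; the actual TEEVE fusion is model-based and more sophisticated) can OR together a visible-difference mask with a thermal threshold. A person in dark clothing whom the visible channel misses is still warm, so the thermal channel fills in the gap; suppressing visible-channel false positives, by contrast, requires a richer scene model than this sketch:

```python
import numpy as np

def visible_mask(frame, background, threshold=30.0):
    """Foreground candidates from visible-spectrum background subtraction."""
    return np.abs(frame.astype(float) - background.astype(float)) > threshold

def thermal_mask(thermal_frame, warm_threshold=90.0):
    """Foreground candidates from thermal intensity (warm objects only)."""
    return thermal_frame.astype(float) > warm_threshold

def fused_mask(frame, background, thermal_frame):
    """Naive fusion: a pixel is foreground if either modality flags it."""
    return visible_mask(frame, background) | thermal_mask(thermal_frame)

# Illustration of figure 3d: a dark shirt against a dark background has too
# little visible contrast, but the person is warm in the thermal channel.
background = np.full((2, 2), 20, dtype=np.uint8)   # dark background
frame      = np.full((2, 2), 25, dtype=np.uint8)   # dark shirt, low contrast
thermal    = np.full((2, 2), 120, dtype=np.uint8)  # warm body
print(fused_mask(frame, background, thermal).all())  # True: thermal fills in
```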
							    Figure 4: Left. TEEVE system based on visible spectrum imaging only. Right. Tele-immersive
							 system based on visible and thermal IR spectrum imaging and information fusion. There are three types of benefits that IR imaging can provide tele-immersive systems:
							(1) IR can enhance image processing tasks at a low level (e.g. human foreground
							detection), (2) IR can allow tele-immersion users to perceive temperature in the virtual
							environment using visual or tactile feedback, and (3) IR can fundamentally enhance
material and object classification.

Final Results

Figure 6 shows the combined results. In the changing illumination experiment (figures 3a and 5, left), the thermal imagery is not sensitive to the lighting change and is able to detect the person in the scene. Also, because our inanimate object detection emphasizes higher-level features, in this case shape, the ball is correctly identified as an object of interest.

In the moving foreground object experiment (figures 3b and 5, middle), shadows are cast on the
							background due to the foreground object's motion in the scene. Room lighting remained constant
							during this experiment, and the change of background pixel intensity is solely due to the human
					 		foreground blocking illumination. This experiment illustrates that visible and thermal fusion is
							robust to shadowing on textured backgrounds.
In the low contrast between foreground and background experiment (figures 3d and 5, right), we demonstrate our fusion algorithm's performance when the foreground object has a color or brightness similar to the background.
							In this case, the current system fails to recognize portions of the person as foreground
							because of their dark clothing. Fusion with thermal information is able to fill in the missing
							information.
 
Figure 6: Top: changing illumination results. The three top pictures show our results of visible and IR fusion (compared to the existing system's performance) in the presence of the changing illumination problem described in figures 3a and 5, left. Middle: results in the presence of the moving foreground object problem (figures 3b and 5, middle). Bottom: improved detection of the foreground object with the thermal information (figures 3d and 5, right).
							
								
| Experiment | Method | F Neg | F Pos | Total | % err |
| Changing illumination | TEEVE | 141 | 21939 | 22080 | 57 |
| Changing illumination | Fusion | 566 | 420 | 986 | 2 |
| Moving foregrounds casting shadows | TEEVE | 0 | 10674 | 10674 | 28 |
| Moving foregrounds casting shadows | Fusion | 244 | 876 | 1120 | 3 |
| Moving background | TEEVE | 122 | 2205 | 2327 | 6 |
| Moving background | Fusion | 117 | 424 | 541 | 1 |
| Low contrast between foreground and background | TEEVE | 6056 | 209 | 6265 | 16 |
| Low contrast between foreground and background | Fusion | 741 | 647 | 1344 | 4 |
| Low contrast between different foreground objects | TEEVE | 74 | 1127 | 1201 | 3 |
| Low contrast between different foreground objects | Fusion | 206 | 1010 | 1216 | 3 |

Table of quantitative results, comparing the performance of the current tele-immersive system (TEEVE) and our proposed fusion algorithm. "F Neg" is the number of pixels incorrectly classified as background (the false negative detections). "F Pos" is the number of pixels incorrectly classified as foreground (the false positive detections). "Total" is the sum of these two pixel counts, and "% err" is the percentage of the image that was misclassified.

Summary
- Calibration of visible and infrared cameras: we extended a state-of-the-art automatic multi-camera calibration technique to simultaneously calibrate grayscale, color, and thermal cameras.
- Development of a methodology for fusing visible and infrared images based on tele-immersive system scene modeling and estimation of the scene's 3D structure.
- Building prototype hardware to acquire visible and thermal IR imagery, and designing off-line processing and analysis algorithms.
- Quantitative analysis of the tele-immersive system with and without fusion of visible and infrared information.

The figure below presents an example of a foreground detection problem approached by fusing thermal infrared (IR) and visible images. Through exploring the fusion of multiple sensor modalities in the context of tele-immersive systems, we can enhance computational efficiency, the user's immersive experience, and automatic scene understanding.
Figure 7: This set of images (from a single time step) demonstrates a particularly challenging scenario involving a scene that is dynamic in both the visible and thermal wavelengths. Top row: visible-wavelength background image; current visible frame; difference between the current visible frame and the background. Bottom row: thermal background; current thermal frame; simple thresholding and connected components in the thermal frame. The scene contains a monitor showing a dynamic video, and a warm cup of water that is cooling over time. Note that the monitor can also change temperature over time if it is turned off or hibernating. In this challenging case, model-based classification is able to tell the difference between the human subject and the other objects.

People, Publications, Presentations
   					
                			
Team members
- Professor Peter Bajcsy, Research group ISDA, National Center for Supercomputing Applications (NCSA), UIUC
- Professor Ruzena Bajcsy, Berkeley Center for Information Technology Research in the Interest of Society (CITRIS)
- Yi Ma, Electrical and Computer Engineering, UIUC
- Mike Frogley, Head Coach, Men's and Women's Basketball, Disability Resources and Educational Services at the University of Illinois (DRES), UIUC
- Brad Hedrick, DRES, UIUC
- Professor Kenneth Watkin, Department of Speech and Hearing Sciences, UIUC
- Professor Claire Tomlin, Electrical Engineering and Computer Sciences, UC Berkeley
- Professor Richard Ivry, Cognition and Action Lab, UC Berkeley
- Professor Robert Gotsch
- Professor Klara Nahrstedt, Research group MONET, Computer Science Department, UIUC
- Rob Kooper, ISDA, NCSA, UIUC
- Gregorij Kurillo, CITRIS tele-immersion, UC Berkeley
- Kenton McHenry, ISDA, NCSA, UIUC
- Rahul Malik, ISDA, Computer Science Department, UIUC
- Hye Jung Na, ISDA, Computer Science Department, UIUC

Former members
- Miles Johnson, ISDA, Aerospace Department, UIUC
- Suk Kyu Lee, ISDA, Computer Science Department, UIUC

Funding
The funding was provided by National Science Foundation grant IIS 07-03756 (490630) and the NCSA core grant.

Publications
- R. Malik and P. Bajcsy, "Achieving Color Constancy Across Multiple Cameras," ACM International Conference on Multimedia, Beijing, China, October 19-24, 2009 (~30% acceptance).
- P. Bajcsy, K. McHenry, H.-J. Na, R. Malik, A. Spencer, S.-K. Lee, R. Kooper, and M. Frogley, "Immersive Environments For Rehabilitation Activities," ACM International Conference on Multimedia, Beijing, China, October 19-24, 2009 (~27.5% acceptance).
- K. McHenry and P. Bajcsy, "Key Aspects in 3D File Format Conversions," Joint Annual Meeting of the Society of American Archivists and the Council of State Archivists, 2009 Research Forum 'Foundations and Innovations', August 11, Hilton Austin, Texas, USA, 2009 [proceedings].
- S.-K. Lee, K. McHenry, R. Kooper, and P. Bajcsy, "Characterizing Human Subjects In Real-Time And Three-Dimensional Spaces By Integrating Thermal-Infrared And Visible Spectrum Cameras," Workshop on Multimedia Aspects in Pervasive Healthcare, in conjunction with the 2009 IEEE International Conference on Multimedia & Expo (ICME), July 3, 2009, New York, NY, USA.
- P. Bajcsy, M. Frogley, R. Kooper, S.-K. Lee, R. Malik, K. McHenry, H.-J. Na, and A. Spencer, "Design And Use Of Immersive Environments For Regaining Proprioceptive Abilities," Workshop on Multimedia Aspects in Pervasive Healthcare, in conjunction with the 2009 IEEE International Conference on Multimedia & Expo (ICME), July 3, 2009, New York, NY, USA.
- R. Malik and P. Bajcsy, "Optimal Stereo Camera Placement Under Spatially Varying Resolution Requirements," 2nd International Conference on Immersive Telecommunications, University of California, Berkeley, CA, USA, May 27-29, 2009 (accepted, ~50% acceptance rate).
- A. Spencer, H. Jung, K. McHenry, H.-J. Na, R. Malik, S.-K. Lee, R. Kooper, and P. Bajcsy, "Tele-Immersive Environments For Everybody," poster at PRAGMA 16, KISTI, Daejeon Convention Center, Korea, March 23-24, 2009.
- Miles Johnson, "Integration of thermal and visible imagery for robust foreground detection in tele-immersive spaces," M.S. thesis in Aerospace Engineering, Graduate College of the University of Illinois at Urbana-Champaign, 2007 [pdf 4.6MB].