    • 1. Granted invention patent
    • Apparatus and method of repairing a processor array for a failure detected at runtime
    • US06851071B2
    • 2005-02-01
    • US09974967
    • 2001-10-11
    • Douglas Craig Bossen; Daniel James Henderson; Raymond Leslie Hicks; Alongkorn Kitamorn; David Otto Lewis; Thomas Alan Liebsch
    • G06F11/00; G06F11/10; G06F11/14; G06F11/16
    • G06F11/1064; G06F11/076; G06F11/0772; G06F11/079; G06F11/0793; G06F11/142; G06F11/1425; G06F2201/81; G11C2029/0401; G11C2029/0409
    • An apparatus and method of repairing a processor array for a failure detected at runtime in a system supporting persistent component deallocation are provided. The apparatus and method of the present invention allow redundant array bits to be used for recoverable faults detected in arrays during run time, instead of only at system boot, while still maintaining the dynamic and persistent processor deallocation features of the computing system. With the apparatus and method of the present invention, a failure of a cache array is detected and a determination is made as to whether a repairable failure threshold is exceeded during runtime. If this threshold is exceeded, a determination is made as to whether cache array redundancy may be applied to correct the failure, i.e., a bit error. If so, the cache array redundancy is applied without marking the processor as unavailable. At some time later, the system undergoes a re-initial program load (re-IPL), at which time it is determined whether a second failure of the processor occurs. If a second failure occurs, a determination is made as to whether any status bits are set for arrays other than the cache array that experienced the present failure. If so, the processor is marked unavailable. If not, a determination is made as to whether cache redundancy can be applied to correct the failure. If so, the failure is corrected using the cache redundancy. If not, the processor is marked unavailable.
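The decision flow described in this abstract can be sketched roughly as follows. This is only an illustrative model, not the patented implementation: the names `ProcessorState`, `handle_runtime_failure`, and `handle_reipl_failure`, the spare-bit counter, and the choice to mark the processor unavailable when no redundancy is left at runtime are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    NO_ACTION = "no action"
    APPLY_REDUNDANCY = "apply redundancy"
    MARK_UNAVAILABLE = "mark unavailable"


@dataclass
class ProcessorState:
    spare_bits: int                                  # hypothetical count of unused redundant array bits
    error_count: int = 0                             # recoverable errors seen so far at runtime
    status_bits: set = field(default_factory=set)    # arrays that have already been repaired
    available: bool = True


def _repair_or_deallocate(cpu: ProcessorState, array: str) -> Action:
    """Use a redundant bit if one is left, otherwise mark the processor unavailable."""
    if cpu.spare_bits > 0:
        cpu.spare_bits -= 1
        cpu.status_bits.add(array)
        return Action.APPLY_REDUNDANCY
    cpu.available = False  # assumption: no redundancy left means deallocation
    return Action.MARK_UNAVAILABLE


def handle_runtime_failure(cpu: ProcessorState, array: str, threshold: int) -> Action:
    """Runtime path: act only once the repairable failure threshold is exceeded."""
    cpu.error_count += 1
    if cpu.error_count <= threshold:
        return Action.NO_ACTION
    return _repair_or_deallocate(cpu, array)


def handle_reipl_failure(cpu: ProcessorState, array: str) -> Action:
    """Re-IPL path: a second failure with status bits set for other arrays deallocates the processor."""
    if cpu.status_bits - {array}:
        cpu.available = False
        return Action.MARK_UNAVAILABLE
    return _repair_or_deallocate(cpu, array)
```

In this sketch a processor with one spare bit survives the first over-threshold failure at runtime and is deallocated only when a later failure finds no redundancy left, or finds that a different array has already needed repair.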
    • 3. Granted invention patent
    • Method and system for boot-time deconfiguration of a processor in a symmetrical multi-processing system
    • US06233680B1
    • 2001-05-15
    • US09165952
    • 1998-10-02
    • Douglas Craig Bossen; Alongkorn Kitamorn; Charles Andrew McLaughlin
    • G06F15/177
    • G06F11/0772; G06F11/0724; G06F11/079; G06F11/0793; G06F15/177
    • A method and system for deconfiguring a CPU in a processing system is disclosed. In one aspect, a processing system is disclosed that comprises a central processing unit (CPU), and a memory coupled to the CPU. The CPU includes an error status register for capturing information concerning the status of the CPU. The processing system includes a service processor for gathering and analyzing status information from the CPU error register. The processing system also includes a nonvolatile device coupled to the service processor. The nonvolatile device includes a deconfiguration area. The deconfiguration area stores information concerning the status of the CPU from the service processor. The deconfiguration area also provides information for deconfiguring a CPU during a boot time of the processing system. Accordingly, through the present invention, CPU errors are detected during normal computer operations by error detection logic. This detection is utilized during any subsequent boot process by service processor firmware to deallocate the defective CPU. This is accomplished through the use of error status registers within the CPU and through the use of a deconfiguration area in the nonvolatile device which provides information directly to the service processor.
    • 5. Granted invention patent
    • Method and system for boot-time deconfiguration of a memory in a processing system
    • US06243823B1
    • 2001-06-05
    • US09165955
    • 1998-10-02
    • Douglas Craig Bossen; Alongkorn Kitamorn; Charles Andrew McLaughlin
    • G06F15/177
    • G06F11/142
    • A method and system for deconfiguring software in a processing system is disclosed. In one aspect, a processing system comprises a central processing unit (CPU), and a memory coupled to the CPU. The memory includes a memory array and a memory controller for capturing information concerning the status of the memory array. The processing system includes a service processor for gathering and analyzing status information from the memory controller. The processing system also includes a nonvolatile device coupled to the CPU and the service processor. The nonvolatile device includes a deconfiguration area. The deconfiguration area stores information concerning the status of the memory array from the service processor. The deconfiguration area also provides information for deconfiguring at least a portion of the memory array during a boot time of the processing system. Accordingly, through the present invention, memory errors are detected during normal computer operations by error detection logic. This detection is utilized during any subsequent boot process by service processor and CPU boot firmware to deallocate the defective memory module. This is accomplished through the use of error status registers within the memory controller and through the use of a deconfiguration area in the nonvolatile device which provides information directly to the CPU boot firmware.
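The boot-time deconfiguration idea shared by the CPU and memory records above (errors captured during normal operation, recorded in a nonvolatile deconfiguration area, then consulted at the next boot) might be modeled along these lines. This is a simplified sketch under stated assumptions: `DeconfigArea`, `record_errors`, and `boot_configure` are hypothetical names, and a plain in-memory set stands in for the nonvolatile device.

```python
from dataclasses import dataclass, field


@dataclass
class DeconfigArea:
    """Stand-in for the deconfiguration area in the nonvolatile device (assumed layout)."""
    failed: set = field(default_factory=set)


def record_errors(area: DeconfigArea, error_status: dict) -> None:
    """Service-processor step: persist every component whose error status flag is set.

    `error_status` maps a component id (CPU or memory module) to a boolean
    read from its error status register.
    """
    for component, has_error in error_status.items():
        if has_error:
            area.failed.add(component)


def boot_configure(area: DeconfigArea, installed: list) -> list:
    """Boot-firmware step: bring up only the components not listed in the area."""
    return [c for c in installed if c not in area.failed]
```

For example, if `cpu1` reported an error before the reboot, a later `boot_configure(area, ["cpu0", "cpu1", "cpu2"])` would return only `cpu0` and `cpu2`, so the defective part stays out of the configuration across boots.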
    • 8. Granted invention patent
    • Enhanced error handling for I/O load/store operations to a PCI device via bad parity or zero byte enables
    • US06223299B1
    • 2001-04-24
    • US09072418
    • 1998-05-04
    • Douglas Craig Bossen; Charles Andrew McLaughlin; Danny Marvin Neal; James Otto Nicholson; Steven Mark Thurber
    • G06F11/00
    • G06F11/0772; G06F11/0745; G06F11/0793
    • Device select lines from each I/O device are brought into a PCI host bridge individually so that the device number of a failing device may be logged in an error register when an error is seen on the PCI bus. Until the error register is reset, subsequent load and store operations are delayed until the device number of the subject device may be checked against the error register. If the subject device is a previously failing device, the load/store operation to that device is prevented from completing, either by forcing bad parity or zeroing all byte enables. By forcing bad parity or zero byte enables, the I/O device will respond to the load or store request by activating its device select line, but will not accept store data. Operations to devices which are not logged in the error register are permitted to proceed normally, as are all load/store operations when the error register is clear. Normal system operations are thus not impacted, and operations during error recovery are permitted to proceed if no further damage will be caused by such operations.
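The gating behavior in this abstract can be illustrated with a small software model. It is a sketch only: real hardware does this with device select lines, parity, and byte enables, whereas the hypothetical `HostBridge` class below simply refuses the data transfer, and it assumes the error register holds a single device number.

```python
class HostBridge:
    """Toy model of a PCI host bridge with a one-entry error register (assumption)."""

    def __init__(self):
        self.error_register = None  # device number of the failing device, None when clear

    def log_error(self, device: int) -> None:
        """Record the failing device when an error is seen on the bus."""
        if self.error_register is None:
            self.error_register = device

    def reset_error(self) -> None:
        """Clear the error register once recovery is complete."""
        self.error_register = None

    def store(self, device: int, data: int, device_memory: dict) -> bool:
        """Attempt a store; return True only if the device accepted the data."""
        if self.error_register == device:
            # Models forced bad parity or zeroed byte enables:
            # the device is selected but accepts no store data.
            return False
        device_memory[device] = data
        return True
```

Stores to devices that are not logged in the error register continue to complete in this model, which mirrors the abstract's point that normal operations are not impacted while a failing device is fenced off.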
    • 10. Granted invention patent
    • Method and system for end-to-end problem determination and fault isolation for storage area networks
    • US06636981B1
    • 2003-10-21
    • US09478306
    • 2000-01-06
    • Barry Stanley Barnett; Douglas Craig Bossen
    • G06F15/177
    • G06F11/0781; G06F11/0727; H04L41/022; H04L41/0609; H04L41/064; H04L41/065
    • A method and system for problem determination and fault isolation in a storage area network (SAN) is provided. A complex configuration of multi-vendor host systems, FC switches, and storage peripherals is connected in a SAN via a communications architecture (CA). A communications architecture element (CAE) is a network-connected device that has successfully registered with a communications architecture manager (CAM) on a host computer via a network service protocol, and the CAM contains problem determination (PD) functionality for the SAN and maintains a SAN PD information table (SPDIT). The CA comprises all network-connected elements capable of communicating information stored in the SPDIT. The CAM uses a SAN topology map and the SPDIT to create a SAN diagnostic table (SDT). A failing component in a particular device may generate errors that cause devices along the same network connection path to generate errors. As the CAM receives error packets or error messages, the errors are stored in the SDT, and each error is analyzed by temporally and spatially comparing the error with other errors in the SDT. If a CAE is determined to be a candidate for generating the error, then the CAE is reported for replacement if possible.
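The temporal and spatial comparison of errors mentioned in this abstract could look something like the sketch below. The abstract does not state the actual selection rule, so the heuristic here (treat the earliest error among those close in time and on a shared connection path as the candidate) is an assumption, as are the names `ErrorRecord`, `related_errors`, and `candidate` and the representation of the topology map as lists of device names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorRecord:
    time: float   # arrival time of the error packet, in seconds
    device: str   # name of the reporting network-connected element


def related_errors(error, sdt, paths, window):
    """Errors within `window` seconds of `error` that share a connection path with it."""
    def share_path(a, b):
        return any(a in path and b in path for path in paths)

    return [
        e for e in sdt
        if abs(e.time - error.time) <= window and share_path(e.device, error.device)
    ]


def candidate(error, sdt, paths, window):
    """Assumed heuristic: the earliest related error points at the likely origin."""
    group = related_errors(error, sdt, paths, window) or [error]
    return min(group, key=lambda e: e.time).device
```

With a path `hostA, switch1, disk1`, an error on `hostA` that arrives half a second after one on `switch1` would be attributed to `switch1`, while an error on an unrelated path is left as its own candidate.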