Ticket #379 (closed defect: fixed)

Opened 5 years ago

Last modified 5 years ago

Exo-Layer recovery with code update Oct 14/2014

Reported by: ibaldin
Owned by: yxin
Priority: major
Milestone:
Component: External: Testing and Redeployment
Version: baseline
Keywords:
Cc: vjo, jonmills

Description (last modified by ibaldin)

This is mostly a historical record of how to do recovery. Here is what I did to recover the ExoLayer and 4 racks:

  1. Stop controllers (RCI, WVN, TAMU, UCD, Exo)
  2. Check for ticketed reservations on all SMs (I closed stuck ticketed reservations first on the respective AMs, then on the SMs; this is the only order that works)
  3. Update configuration, code on RCI and restart AM+BROKER and SM
  4. Update configuration, code on WVN and restart AM+BROKER and SM
  5. Update configuration, code on UCD and restart AM+BROKER and SM
  6. Update configuration, code on TAMU and restart AM+BROKER and SM
  7. Update configuration, code on geni2/ION+DD and restart AM+BROKER
  8. Update configuration, code on geni-ben/BEN and restart AM+BROKER
  9. Update configuration, code on geni
  10. Restart NDL-BROKER
  11. Restart EXO-SM
  12. Restart EXO-CONTROLLER
  13. Restart other controllers
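
For anyone who wants to turn the steps above into a checklist, here is a minimal sketch that only builds the ordered list of steps. It runs no commands. The class name RecoveryPlanSketch and the method buildPlan are hypothetical and not part of ORCA; the rack names and step wording are taken from the list above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of ORCA: it only builds the ordered checklist.
class RecoveryPlanSketch {

    // racks: rack names in the order they should be updated.
    static List<String> buildPlan(List<String> racks) {
        List<String> plan = new ArrayList<>();
        // Controllers go down first and come back last.
        plan.add("Stop controllers (" + String.join(", ", racks) + ", Exo)");
        plan.add("Close stuck ticketed reservations on AMs, then on SMs");
        // Each rack is updated and restarted before the Exo layer is touched.
        for (String rack : racks) {
            plan.add("Update configuration, code on " + rack
                    + " and restart AM+BROKER and SM");
        }
        plan.add("Update configuration, code on geni2/ION+DD and restart AM+BROKER");
        plan.add("Update configuration, code on geni-ben/BEN and restart AM+BROKER");
        plan.add("Update configuration, code on geni");
        plan.add("Restart NDL-BROKER");
        plan.add("Restart EXO-SM");
        plan.add("Restart EXO-CONTROLLER");
        plan.add("Restart other controllers");
        return plan;
    }

    public static void main(String[] args) {
        List<String> plan = buildPlan(Arrays.asList("RCI", "WVN", "UCD", "TAMU"));
        for (int i = 0; i < plan.size(); i++) {
            System.out.println((i + 1) + ". " + plan.get(i));
        }
    }
}
```

The point of keeping the plan as data is that the ordering (controllers stopped first, racks before the Exo layer, controllers restarted last) can be reviewed before anything is restarted.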

Change History

Changed 5 years ago by ibaldin

  • description modified

Encountered an exception we have seen before. Something is not quite right with recovering slices that have probably been extended multiple times.

java.lang.NullPointerException
INFO | jvm 1 | 2014/10/14 16:22:44 | at java.util.Calendar.setTime(Calendar.java:1106)
INFO | jvm 1 | 2014/10/14 16:22:44 | at orca.ndl.elements.OrcaReservationTerm._setEnd(OrcaReservationTerm.java:59)
INFO | jvm 1 | 2014/10/14 16:22:44 | at orca.ndl.elements.OrcaReservationTerm.setStart(OrcaReservationTerm.java:83)
INFO | jvm 1 | 2014/10/14 16:22:44 | at orca.embed.workflow.RequestWorkflow.recover(RequestWorkflow.java:350)
INFO | jvm 1 | 2014/10/14 16:22:44 | at orca.embed.workflow.RequestWorkflow.recover(RequestWorkflow.java:316)
INFO | jvm 1 | 2014/10/14 16:22:44 | at orca.controllers.xmlrpc.XmlrpcControllerSlice.recover(XmlrpcControllerSlice.java:464)
INFO | jvm 1 | 2014/10/14 16:22:44 | at orca.controllers.xmlrpc.XmlrpcOrcaState.recoverSlice(XmlrpcOrcaState.java:452)
INFO | jvm 1 | 2014/10/14 16:22:44 | at orca.controllers.xmlrpc.XmlrpcOrcaState.recover(XmlrpcOrcaState.java:413)
INFO | jvm 1 | 2014/10/14 16:22:44 | at orca.controllers.xmlrpc.XmlRpcController._recover(XmlRpcController.java:196)
INFO | jvm 1 | 2014/10/14 16:22:44 | at orca.controllers.OrcaController.recover(OrcaController.java:62)
INFO | jvm 1 | 2014/10/14 16:22:44 | at orca.controllers.xmlrpc.XmlRpcController.access$100(XmlRpcController.java:24)
INFO | jvm 1 | 2014/10/14 16:22:44 | at orca.controllers.xmlrpc.XmlRpcController$ControllerContextListener.start(XmlRpcController.java:174)
INFO | jvm 1 | 2014/10/14 16:22:44 | at orca.controllers.xmlrpc.XmlRpcController.start(XmlRpcController.java:156)
INFO | jvm 1 | 2014/10/14 16:22:44 | at orca.controllers.xmlrpc.XmlRpcController.main(XmlRpcController.java:201)
INFO | jvm 1 | 2014/10/14 16:22:44 | at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
INFO | jvm 1 | 2014/10/14 16:22:44 | at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
INFO | jvm 1 | 2014/10/14 16:22:44 | at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
INFO | jvm 1 | 2014/10/14 16:22:44 | at java.lang.reflect.Method.invoke(Method.java:606)
INFO | jvm 1 | 2014/10/14 16:22:44 | at org.tanukisoftware.wrapper.WrapperSimpleApp.run(WrapperSimpleApp.java:240)
INFO | jvm 1 | 2014/10/14 16:22:44 | at java.lang.Thread.run(Thread.java:745)
INFO | jvm 1 | 2014/10/14 16:22:44 | 2014-10-14 16:22:44,900 [WrapperSimpleAppMain] ERROR controller.orca.controllers.xmlrpc.XmlrpcOrcaState - Unable to recover slice b97052ca-b908-4955-b581-8e5110e3b5ef/adamant due to: java.lang.NullPointerException
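
The top frames of the trace (setStart calling _setEnd, which calls Calendar.setTime) are consistent with a null Date reaching Calendar.setTime, which throws NullPointerException on a null argument. The sketch below reproduces only that pattern. TermSketch, unguardedEnd and guardedEnd are hypothetical names, not the actual OrcaReservationTerm code, and the day-based duration is an assumption made for illustration.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical stand-in, NOT the actual orca.ndl.elements.OrcaReservationTerm.
class TermSketch {

    // Mirrors the failing pattern: an end date derived from a start date
    // without checking that the start date is present.
    static Date unguardedEnd(Date start, int durationDays) {
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
        cal.setTime(start); // throws NullPointerException when start is null
        cal.add(Calendar.DAY_OF_YEAR, durationDays);
        return cal.getTime();
    }

    // Guarded variant: returns null for a missing start date so the caller
    // can decide how to handle the slice instead of failing mid-recovery.
    static Date guardedEnd(Date start, int durationDays) {
        if (start == null) {
            return null;
        }
        return unguardedEnd(start, durationDays);
    }

    public static void main(String[] args) {
        System.out.println("guarded, null start: " + guardedEnd(null, 1));
        try {
            unguardedEnd(null, 1);
        } catch (NullPointerException e) {
            System.out.println("unguarded, null start: NullPointerException");
        }
    }
}
```

Whether the null in the real code comes from the stored term of a slice that was extended several times is not established by the trace; the sketch only shows where a guard or an explicit error message could go.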

Changed 5 years ago by ibaldin

  • owner changed from ibaldin to yxin
  • component changed from Don't Know to External: Testing and Redeployment

Changed 5 years ago by ibaldin

  • status changed from new to closed
  • resolution set to fixed

Another issue was a complaint from Ezra that a slice he thought he had extended expired and went away. This needs testing. That issue is documented in #380; this ticket is closed.
