Ticket #366 (closed defect: fixed)

Opened 5 years ago

Last modified 5 years ago

Interfaces issue for (vm+ baremetal) slice with multiple storage

Reported by: anirban
Owned by: yxin
Priority: major
Milestone:
Component: Don't Know
Version: baseline
Keywords:
Cc: ibaldin, anirban, pruth, yxin

Description

For a slice with VMs, baremetal nodes, and storage attached to each (request attached), the dataplane interfaces fail to come up on both the VMs and the baremetal nodes when the slice is launched at UCD.

Storage is correctly mounted on the VM and the baremetal node.

Here's the output of ifconfig for the baremetal node with storage; it is missing the dataplane interface but has the storage interface, p2p1.1009, with IP address 10.104.0.5:

[root@ucd-w9 ~]# ifconfig -a
em2 Link encap:Ethernet HWaddr 40:F2:E9:26:52:6C
inet addr:10.101.0.19 Bcast:10.101.0.255 Mask:255.255.255.0
inet6 addr: fe80::42f2:e9ff:fe26:526c/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:477971 errors:0 dropped:0 overruns:0 frame:0
TX packets:45785 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:716434678 (683.2 MiB) TX bytes:3471965 (3.3 MiB)
Memory:c45c0000-c45e0000

em5 Link encap:Ethernet HWaddr 40:F2:E9:26:52:6D
BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
Memory:c45e0000-c4600000

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)

p2p1 Link encap:Ethernet HWaddr 00:07:43:14:8F:80
inet6 addr: fe80::207:43ff:fe14:8f80/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:87864 errors:0 dropped:0 overruns:0 frame:0
TX packets:168535 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:10397674 (9.9 MiB) TX bytes:249127827 (237.5 MiB)
Interrupt:32

p2p1.1009 Link encap:Ethernet HWaddr 00:07:43:14:8F:80
inet addr:10.104.0.5 Bcast:10.104.0.255 Mask:255.255.255.0
inet6 addr: fe80::207:43ff:fe14:8f80/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:87604 errors:0 dropped:0 overruns:0 frame:0
TX packets:20359 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:8418126 (8.0 MiB) TX bytes:237990246 (226.9 MiB)

p2p2 Link encap:Ethernet HWaddr 00:07:43:14:8F:88
inet6 addr: fe80::7:4300:114:8f88/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:5655 errors:0 dropped:0 overruns:0 frame:0
TX packets:266 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:384540 (375.5 KiB) TX bytes:29355 (28.6 KiB)
Interrupt:32

usb0 Link encap:Ethernet HWaddr 42:F2:E9:26:52:69
BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)

================

Here's the output of ifconfig for the VM with storage. It is also missing the dataplane interface, but has the storage interface with IP 10.104.0.6 (different from the IP on the baremetal node):

root@Node2:~# ifconfig -a
eth0 Link encap:Ethernet HWaddr fa:16:3e:25:11:45
inet addr:10.103.0.18 Bcast:10.103.0.255 Mask:255.255.255.0
inet6 addr: fe80::f816:3eff:fe25:1145/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:6976 errors:0 dropped:0 overruns:0 frame:0
TX packets:6577 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1950669 (1.8 MiB) TX bytes:679633 (663.7 KiB)
Interrupt:11 Base address:0xa000

eth1 Link encap:Ethernet HWaddr fe:16:3e:00:10:78
inet addr:10.104.0.6 Bcast:10.104.0.255 Mask:255.255.255.0
inet6 addr: fe80::fc16:3eff:fe00:1078/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:89213 errors:0 dropped:0 overruns:0 frame:0
TX packets:38254 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:8123642 (7.7 MiB) TX bytes:239180704 (228.1 MiB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

The corresponding output of neuca-user-data:

root@Node2:~# neuca-user-data
[global]
actor_id=b0e0e413-77ef-4775-b476-040ca8377d1d
slice_id=496d8616-c2ec-4f36-ae91-12419e240556
reservation_id=f56658ec-da0c-4ffe-b6f2-d20ad33af2c1
unit_id=c3d8166a-0b82-473a-a1bf-77aff40f1a54
;router= Not Specified
iscsi_initiator_iqn=iqn.2012-02.net.exogeni:d7941e9c-0bab-4bf3-a306-86dab7344f45
slice_name=vm-bm-storage-ucd-2
unit_url=http://geni-orca.renci.org/owl/09665f1f-047e-45a2-9dc6-12073c2ff492#Node2
host_name=Node2
management_ip=128.120.83.21
physical_host=ucd-w6
nova_id=65c93e3d-ca4c-4024-bc57-542295d813fe
[users]
root=no:ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAtRKW2Tny87MJvyJXq+Pr+MUXf9U9/h0VO1fao5U/TfPMzBqEvqmkvmrSuodEnQcr/nqK9yzJmU+famDe47hMrbjVKxfpX3LasbYRYIVmKoLT5B5JZv1QO2m/VeKuN5eJySB6YsX0NAijNf14pP1cBT7lfCwm7tk7q4WLcmg9Xi9wy05NV+zhi8aaqR3hHx+56YI6O/f+WW6iqOV6jtWMp1tViWc6GeDWJtPAX+XU+j0vgNhzHuin7hh4oLi/rof3B8wJFBv/RNE2KVJa6lrbIGsUYdhXWZyjz0rKhvo85F8vueLfTp0MVWrMPszBq32bzRdtpNvN199m6Qyq9kpTXQ== anirban@…:ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAtRKW2Tny87MJvyJXq+Pr+MUXf9U9/h0VO1fao5U/TfPMzBqEvqmkvmrSuodEnQcr/nqK9yzJmU+famDe47hMrbjVKxfpX3LasbYRYIVmKoLT5B5JZv1QO2m/VeKuN5eJySB6YsX0NAijNf14pP1cBT7lfCwm7tk7q4WLcmg9Xi9wy05NV+zhi8aaqR3hHx+56YI6O/f+WW6iqOV6jtWMp1tViWc6GeDWJtPAX+XU+j0vgNhzHuin7hh4oLi/rof3B8wJFBv/RNE2KVJa6lrbIGsUYdhXWZyjz0rKhvo85F8vueLfTp0MVWrMPszBq32bzRdtpNvN199m6Qyq9kpTXQ== anirban@…:
[interfaces]
fe163e00524f=up:ipv4:172.16.100.2/25
fe163e001078=up:ipv4:10.104.0.6/24
[storage]
dev0=iscsi:10.104.0.2:3260:1:1e7jpea:1rrfotffsfibhnagng330hap4v:yes:ext4:-F -b 2048:yes:/mnt/target
[routes]
[scripts]
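
The [interfaces] section above maps a MAC address (colons stripped) to a state:protocol:CIDR triple. As a rough illustration, here is a small parsing sketch (this is my own code, not NEUCA's; the field layout is inferred from the output above). The handler is expected to bring up one interface per entry; in this VM, eth1 (fe:16:3e:00:10:78) matches the storage entry, while nothing in the ifconfig output matches fe163e00524f, the dataplane entry with 172.16.100.2/25:

```python
# Minimal sketch of parsing the [interfaces] section of neuca-user-data.
# The key/value layout (MAC with colons stripped = state:protocol:CIDR)
# is inferred from this ticket's output, not taken from NEUCA sources.
import configparser

USER_DATA = """\
[interfaces]
fe163e00524f=up:ipv4:172.16.100.2/25
fe163e001078=up:ipv4:10.104.0.6/24
"""

def parse_interfaces(text):
    cp = configparser.ConfigParser()
    cp.read_string(text)
    ifaces = {}
    for mac, value in cp.items("interfaces"):
        state, proto, cidr = value.split(":")
        # Re-insert colons so the MAC can be matched against ifconfig output.
        mac = ":".join(mac[i:i + 2] for i in range(0, 12, 2))
        ifaces[mac] = {"state": state, "proto": proto, "cidr": cidr}
    return ifaces

print(parse_interfaces(USER_DATA))
```

Running this against the user data above shows the dataplane entry as fe:16:3e:00:52:4f with 172.16.100.2/25, a MAC that has no matching interface in the VM's ifconfig output.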

===================

The other VMs, which are not attached to the storage, also don't get dataplane interfaces. Here's the output of neuca-user-data for one of them:

root@NodeGroup0-1:~# neuca-user-data
[global]
actor_id=b0e0e413-77ef-4775-b476-040ca8377d1d
slice_id=496d8616-c2ec-4f36-ae91-12419e240556
reservation_id=87a6b9b1-41a8-4715-b17a-c967795a040e
unit_id=296439a5-68a0-402f-a1f8-66e4bf7d6adc
;router= Not Specified
;iscsi_initiator_iqn= Not Specified
slice_name=vm-bm-storage-ucd-2
unit_url=http://geni-orca.renci.org/owl/09665f1f-047e-45a2-9dc6-12073c2ff492#NodeGroup0/1
host_name=NodeGroup0-1
management_ip=128.120.83.5
physical_host=ucd-w8
nova_id=d576e7fa-4b9d-4116-99b8-da05af0e0b06
[users]
root=no:ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAtRKW2Tny87MJvyJXq+Pr+MUXf9U9/h0VO1fao5U/TfPMzBqEvqmkvmrSuodEnQcr/nqK9yzJmU+famDe47hMrbjVKxfpX3LasbYRYIVmKoLT5B5JZv1QO2m/VeKuN5eJySB6YsX0NAijNf14pP1cBT7lfCwm7tk7q4WLcmg9Xi9wy05NV+zhi8aaqR3hHx+56YI6O/f+WW6iqOV6jtWMp1tViWc6GeDWJtPAX+XU+j0vgNhzHuin7hh4oLi/rof3B8wJFBv/RNE2KVJa6lrbIGsUYdhXWZyjz0rKhvo85F8vueLfTp0MVWrMPszBq32bzRdtpNvN199m6Qyq9kpTXQ== anirban@…:ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAtRKW2Tny87MJvyJXq+Pr+MUXf9U9/h0VO1fao5U/TfPMzBqEvqmkvmrSuodEnQcr/nqK9yzJmU+famDe47hMrbjVKxfpX3LasbYRYIVmKoLT5B5JZv1QO2m/VeKuN5eJySB6YsX0NAijNf14pP1cBT7lfCwm7tk7q4WLcmg9Xi9wy05NV+zhi8aaqR3hHx+56YI6O/f+WW6iqOV6jtWMp1tViWc6GeDWJtPAX+XU+j0vgNhzHuin7hh4oLi/rof3B8wJFBv/RNE2KVJa6lrbIGsUYdhXWZyjz0rKhvo85F8vueLfTp0MVWrMPszBq32bzRdtpNvN199m6Qyq9kpTXQ== anirban@…:
[interfaces]
fe163e006228=up:ipv4:172.16.100.5/25
[storage]
[routes]
[scripts]

======================

The other baremetal node also doesn't get a dataplane IP.

======================

Here's the log output from handler-vm.log related to QUANTUM_NET_NETWORK for the time period when this slice was running (the window may not be exact):

2014-09-18 21:54:27,656 -- neuca-quantum-add-iface 18087 DEBUG : QUANTUM_NET_NETWORK: data
2014-09-18 21:54:50,927 -- neuca-quantum-add-iface 19076 DEBUG : QUANTUM_NET_NETWORK: data
2014-09-18 21:54:56,049 -- neuca-quantum-add-iface 19308 DEBUG : QUANTUM_NET_NETWORK: of-data
2014-09-18 21:54:56,424 -- neuca-quantum-add-iface 19331 DEBUG : QUANTUM_NET_NETWORK: of-data
2014-09-18 21:55:06,080 -- neuca-quantum-add-iface 19547 DEBUG : QUANTUM_NET_NETWORK: data
2014-09-18 21:55:07,859 -- neuca-quantum-add-iface 19564 DEBUG : QUANTUM_NET_NETWORK: vlan-storage
2014-09-18 22:02:43,279 -- neuca-quantum-add-iface 29933 DEBUG : QUANTUM_NET_NETWORK: vlan-data
2014-09-18 22:02:45,893 -- neuca-quantum-add-iface 30027 DEBUG : QUANTUM_NET_NETWORK: vlan-data
2014-09-18 22:02:46,421 -- neuca-quantum-add-iface 30056 DEBUG : QUANTUM_NET_NETWORK: vlan-data
2014-09-18 22:06:30,580 -- neuca-quantum-add-iface 4661 DEBUG : QUANTUM_NET_NETWORK: vlan-data
2014-09-18 22:06:52,310 -- neuca-quantum-add-iface 5081 DEBUG : QUANTUM_NET_NETWORK: vlan-data
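
The mix of names in this log is itself telling: some entries use the vlan-data / vlan-storage names, while the earlier entries show bare "data" and "of-data". A quick way to pull out the odd names (a throwaway sketch assuming only the log format shown above; this is not an ORCA tool):

```python
# Scan handler-vm.log-style lines for QUANTUM_NET_NETWORK values and
# collect any names lacking the "vlan-" prefix seen on working slices.
# LOG_LINES is a trimmed sample of the excerpt above.
import re

LOG_LINES = """\
2014-09-18 21:54:27,656 -- neuca-quantum-add-iface 18087 DEBUG : QUANTUM_NET_NETWORK: data
2014-09-18 21:55:07,859 -- neuca-quantum-add-iface 19564 DEBUG : QUANTUM_NET_NETWORK: vlan-storage
2014-09-18 22:02:43,279 -- neuca-quantum-add-iface 29933 DEBUG : QUANTUM_NET_NETWORK: vlan-data
""".splitlines()

PATTERN = re.compile(r"QUANTUM_NET_NETWORK:\s*(\S+)")

def suspicious_networks(lines):
    names = set()
    for line in lines:
        m = PATTERN.search(line)
        if m and not m.group(1).startswith("vlan-"):
            names.add(m.group(1))
    return names

print(suspicious_networks(LOG_LINES))  # prints {'data'}
```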

Attachments

vm-bm-storage-ucd.rdf (15.2 kB) - added by anirban 5 years ago.
request with multiple storage with vms and baremetal

Change History

Changed 5 years ago by anirban

request with multiple storage with vms and baremetal

Changed 5 years ago by anirban

  • cc yxin added

Changed 5 years ago by yxin

Paul,

Could you please check whether Quantum is configured correctly at UCD?

The user data contains the correct IP addresses; however, the handler output reports the network names as "data", "of-data", etc., which do not look right.

Changed 5 years ago by anirban

Just FYI, other slices worked fine with respect to interfaces. For example, a slice with a VM attached to storage and a nodegroup, launched after the problematic slice, came up fine with the correct interfaces.

Changed 5 years ago by pruth

Quantum looks to be configured correctly.

There are some VMs in a wedged state and a lot of quantum networks (orphans?). These could be from clean restarts. UCD might need a clean restart with a manual cleanup.

Changed 5 years ago by yxin

There is an error in ucdvmsite.rdf that may contribute to the problem. I checked in the fix; please update the ref and test again.

For some reason, I don't see the UCD and TAMU sites in my Flukes, so it's hard to test...

I'll try to debug it further in unit tests for now.

Changed 5 years ago by ibaldin

The likely issue is with the network names, either in the RDF, in quantum (the config should be on every worker node under /etc/quantum/plugins/neuca), or in xcat (am/config/xcat.site.properties). I fixed a similar issue in the TAMU RDF: the baremetal node interface definition had the wrong network name (data vs. vlan-data).
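
A toy cross-check along these lines (a hypothetical helper using plain substring matching; the config texts below are made up, and only the file locations come from the comment above):

```python
# Illustrative consistency check: flag which config sources do not mention
# a given network name, to catch "data" vs. "vlan-data" mismatches.
# The CONFIGS contents are invented sample text, not real site configs.

def missing_from(name, configs):
    """Return the config sources whose text does not mention `name`."""
    return sorted(src for src, text in configs.items() if name not in text)

CONFIGS = {
    "ucdvmsite.rdf": "... baremetal interface network: data ...",
    "quantum (/etc/quantum/plugins/neuca)": "... vlan-data vlan-storage ...",
    "xcat (am/config/xcat.site.properties)": "... vlan-data ...",
}

print(missing_from("vlan-data", CONFIGS))
# prints ['ucdvmsite.rdf'] -- the RDF uses the bare name "data" instead
```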

Changed 5 years ago by yxin

1. I found and fixed the problem with the repeated IP address on the storage interfaces.
2. I found and fixed a problem with ucdvmsite.rdf and tamuvmsite.rdf (not sure why it worked before?).

Please update, redeploy, and test the complex topology. Thanks!

Changed 5 years ago by ibaldin

  • status changed from new to closed
  • resolution set to fixed

This appears resolved; Flukes (0.5-SNAPSHOT) has also been updated to deal with the new model features.
